CompTIA IT Fundamentals FC0-U61 – Understanding Databases

  1. Introduction to Databases

Databases are incredibly powerful and useful in modern computing. To new users, though, they can seem difficult or opaque. So let’s clear that picture right now, starting with what you know. We have a teacher, Eliza, who wants to keep track of her math students throughout the semester. She needs to record information such as student names, maybe demographic information like gender or race test scores, notes on improvements or discipline, and so on. By recording this information, she can track how the students are doing during the semester very quickly, seeing if Joy’s grades are slipping or if Billy’s acting up more than usual. How could she accomplish this goal of tracking her students? The traditional way is through a gradebook, right? A handwritten list of names, grades, and other information. In computer jargon, this grade book is called Flat File Storage.

Modern teachers create the same kind of list, but use a spreadsheet program like Microsoft Excel to record the information. And that works great for a single classroom, say, of 15 students in one subject for a semester. It even scales up. Eliza teaches four periods of math with a total of 60 students. She makes a spreadsheet with 60 rows, one for each student, and a number of columns that represent things to record last name, first name, quiz scores, exam scores, standardized test scores, and so on. Let’s take the school example up another level. Principal Martha Price wants to track the progress of all the students in her school. She has ten teachers in 15 subjects, like math, English, computer science it fundamentals, and so on. With about 50 students per subject. I’ll let the numbers sink in for a second.

Got it. The spreadsheet solution for each teacher will work, right? It might get a little messy, but Principal Price can get spreadsheets from all the teachers at the end of the semester, then create a master spreadsheet that has all the information she wants. This brute force method is how principals kept tracks of grades and progress over time forever, or at least since paper was inexpensive. But let’s go up another level. The state wants to assess student performance at every school, specifically on standardized tests.

The school district therefore needs to collect data from all the schools in the district. Will the spreadsheet method, even the brute force method employed by Principal Price, work? Once you scale up to a whole district, or worse, a whole state? No way. It would be a ton of work with lots of potentials for errors at every level. Once you need to scale up in data collection, a database becomes the preferred tool to use. A database, in a basic sense, functions like a giant spreadsheet. The underlying structure is a table, but it can scale up easily to handle much more than a spreadsheet. And by more, I mean a couple of things. First, databases can handle exceptionally complex information. Second, databases can handle input from many people, all at the same time.

Third, you can design a database to respond to questions regarding any combination of data stores. Those questions are called queries. Let’s go back to our example and see how it plays out in the school district where you’ll find principal price and her school. Principal price could have our it. Department create a centralized database that has entries for each student. Each student name would have its own row on the database table. Each row is a record. Each column in the table represents a type of data to record, such as last name, first name, Social Security, number, demographics, teacher subject, midterm grade finals, standardized test scores, and so on. Each place where a row and column intersect is a cell. Each columns in a database table is a field. So let’s be clear here.

A record is the basic building block for a database, and each record contains one or more fields with specific types of information. So eliza, the math teacher, can log into the database from her classroom, access the records for her students, and update the database at the same time. Ramon, the english teacher, can log into the database from his classroom, access records for his students, and update the same database. And at the same time, principal price can run a query on the student records in the database, getting information she needs to send to the district. Cool, right? Databases offer a powerful tool when you need to scale up the data you need to collect.

  1. Database Details

Robust solution for dealing with a lot of data, possibly input from multiple people over multiple locations, and so on. This episode focuses on specific elements of a database to see how they work together. To make a database do its job, we’ll tackle some new jargon words, but you’ll learn more than enough to make the jargon make sense. The first term you need to understand is relational database. The typical database has a lot of tables, with each table holding a specific set of information or data. Each table links to one or more tables according to how they relate.

An extended example will make this more clear. Let’s explore the inner workings of the student gradebook database instituted in Principal Prices school, the database is installed on the main school server, accessible to all the teachers so they can input grades in various subjects for all the students in the school. Principal Price uses the database to track students throughout the semester and from year to year. She can also use the yearly reports to show progress in the school to the district administration and up to the state level. This is good stuff.

Let me ask you a question. To organize a school wide student database, what would you expect to use as a basic record? A student, right? At first blush, organizing the records by student name wouldn’t make sense. But what do you do with students with the same name? John Smith or Jose Garcia aren’t uncommon names, after all, it would make more sense to have a unique element like a record number, as the distinguishing characteristic for each record.

That unique element is called the primary key. When you create a relational database like this gradebook example, you’ll break out each component into its own table and then link the tables together according to how they relate to each other. So for example, you’d have the student table with various pieces of information. You’d have a separate table for the subjects taught and another table that simply links them together like an enrollment table. Note that each table has a unique primary key. Plus, each table has multiple data entry points. The field Headers the student table here has four fields ID, last name, first name, and date of birth. When you have a table that gets information from another table, like the enrollment table, getting both a student ID and a subject ID, the table uses the primary keys from each of the other tables. A primary key used in another table gets a special name called a foreign key.

Each of the tables in this database link together in some way they’re related. Thus relational Database the classes table, for example, has various fields, one of which is the teacher ID. It relates to the teacher’s table. Let’s parse this a moment. Each class has an ID, right? The primary key number. Each class has a name and description. Plus here at least, each class, like Math, only has one teacher that teacher field draws from the teacher table.

The teacher table has a primary key plus a last name and first name of the teacher. Each field and each table has to contain a specific kind of data. For example, the name fields and the student and teacher fields are just characters or strings. But look at the assignments table. See the total points and earned points fields. These have to be numbers, not letters.

The database designer puts a constraint on each of those fields so people can only enter numbers or integers. A constraint determines the type of data in a field. Breaking down each element of a database into specific tables helps make the relational database super fast and efficient. Let me note that you have to get all these elements correct upfront to make the database work well. These elements are called the schema. In a relational database, all the tables relate to one another in some way. This sort of database is what you’ll see and use all over the place.

Out in the real world, relational databases enable you to store huge amounts of information and get query returns quickly. So let’s wrap up the data structure episode with queries, running questions about various fields and tables and then showing the results. For example, Principal Price wants to check on the standardized test scores for Math for the last three years.

The information is in the database. She’ll run a query, most likely using a form set up by the database designer. Underneath, though, that form uses a query language called Structured Query Language, or SQL. A command you’d see, if you typed it directly, would be select for choosing data sets like Math standardized test scores for year one, year two, and so on. The queries will return relevant information that Principal Price can review and share with her superiors. At the district level, relational databases contain multiple related tables and can store enormous amounts of data. The linked tables enable fast and efficient queries on any number of data fields.

  1. AMA – Ask Me Anything about Databases

Hey, Shannon. Hey, Scott. What are you working on? I’m working on ITF. Well, how about that? You know, I know a book. Yeah, I’ve heard that. Got a couple of questions for you. Lay it on me. Let’s talk about databases. Okay. All right. First of all, if I update my database and there’s no save button, how do I know that my data is safe? Right?

One of the key factors with a database is changes are made automatically. They’re saved automatically. And the reason why is because when you have multiple people working on a database at the same time, you want to have all of that information just updated at the same time so that every person sees everything. Right? That then begs the question is, how do you know your stuff is safe? How do you have data? Persistence is the word we’re looking for. And that goes back to how the server is set up, the thing that’s supporting the database, how the network tech or whoever set it up properly. Does it have good backups? Is there data redundancy?

For example, if a hard drive dies, does it automatically failover to another hard drive? All of those features are what give a database persistence for the data. So you know your stuff is safe, but it’s all based on more what’s happening on the server side as opposed to on your local. Okay. Okay, so along with that, I’ve heard the word IOPS. Okay, so when you have a central database, you’re also talking about performance, right? And we talk in terms of input output per second, like how much stuff you can do on a database. And not just you, right, but how much stuff all of the users at the same time can do on a database. The more IOPS, the better. Now, when you’re dealing with, say, a Soho network, a small one with, say, five users, IOPS is important, but there’s a certain level of performance that you can get, and it’s going to be acceptable.

Now, scale that same database up to, say, like a state level, you need way better performance. And to get that performance, you need to throw way more hardware at it. So faster CPUs, more Ram, better hard drives, better networking that will improve your IOPS and make the experience for all of your users much, much better. Okay, that’s the server that I want. Is there a downside now? Yes. The more hardware you throw at a server, the more money it’s going to cost. So, of course, the better the server, the more I ops and the more expensive, of course, had to be a downside. All right, so what else? Another question. How do you get the information into the database? That’s a good question. That goes back to when we first talked about using spreadsheets, right? And then going from spreadsheets into the much more robust databases.

All that information is still there in the spreadsheet. You want to be able to get it into the database somehow. And the way we do that is through what’s called a CSV file. CSV comma separated values. Okay. In the spreadsheet you can export a table and the data in rows and columns and cells. All the data, all the cells get separated into terms with little commas comma separated values. Yeah, exactly. And the database, you import that CSV file into the database and the database just picks it up, puts it into tabular format, and can use the data just like that. So a CSV file is how we import data into a table. Got it. Speaking of databases yes, I know about relational databases. Is there such a thing as a non-relational database? Yes, absolutely.

With relational databases, that’s what you’re going to see more often than not. But there are definitely non relational databases, and we call them usually no SQL databases. The difference really is with a relational database, you need to set up all the schema and stuff ahead of time, and so that all of the linkages of the tables are all perfect and tight with a no SQL database, instead they’re more adaptable or flexible. So if you get a completely new thing of information, you can just add that to your database. The downside with a NoSQL database is that that flexibility means it doesn’t have quite the performance that a relational database has. Now don’t get me wrong, the no SQL databases have structure, they’re just not the same as relational databases. And that flexibility. I mean, they still have things like primary keys, although they call them attributes, go figure. But they’re more adaptable and extensible and you’ll mainly see relational databases in the real world.

So there you go. Okay, so what else? Last question. Let’s say I’ve got a database on a central server, and I have users that are accessing it, and maybe some of this information I don’t want some users to access and others to access. Can you do permissions? Yes. In fact, you can use SQL commands like you can create to create user accounts. You can use the grant command to change data permissions to data definitions. Really? So that you can grant permissions to various user accounts? Yes, you can do that. Although more often than not, you’ll use a third party client program and a server program gram to handle all of the user accounts and permissions. But on an exam in your near future, you might see things like the create command or the grant command. So if I have different tables that some people need to access and other people don’t, other people don’t just know that you can grant permissions? Absolutely. Okay, good to know. Well, that’s all the questions that I have. That’s great. Good luck. Thanks, Scott.

img