SQL Explained (By a Non-Developer)

Welcome to the second installment of Developer lingo explained by a non-developer. How do you follow up with a term so widely used and intricate to modern development, like JavaScript? I decided to take it back to a more tried and true function within software development, SQL.

SQL stands for Structured Query Language and is used to communicate within and manipulate relational databases. It is good at handling structured data that includes relations between entities and variables of those entities.

Another Brief History Lesson

SQL has been around for quite some time with the first emergence of the language in the 1970’s when IBM researchers decided to take the findings from Edgar Frank Todd’s paper, “A Relational Model of Data for Large Shared Data Banks,” and create a query language originally called SEQUEL. Yup, just like JavaScript, SQL used to have a different name. Unlike JavaScript’s name change, SQL was changed due to trademark infringement, there were no sneaky marketing ploys hanging around. This name change happens around 1973, but I couldn’t find much on an exact date. Although the named changed, there is still an ongoing debate even today on the pronunciation of SQL, some saying “S-Q-L” while others say “sequel”. How do you pronounce SQL?

SQL slowly became the universal standard for relational database management systems(RDBMS) and remains an important player in the database world, but it wasn’t actually available to the public until 1979. A company called Relational Software released its commercial version of SQL called Oracle V2. If you haven’t already guessed, Relational Software would later become Oracle. After SQL became commercialized, products like MySQL were born out the desire of companies to not spend time managing their own database or writing their own SQL from scratch. It remains one of the most popular SQL database management systems today and set a standard for all database management systems that exist today.

To SQL or To NoSQL, That is THE Question

Ah, the question. THE question that everyone and anyone has heard, debated, and discussed if they know even a sliver of information about databases. First off, I will share my favorite blog on comparing SQL and NoSQL, written by HarperDB CTO, Kyle. He does an excellent job explaining the difference and where HarperDB fits in (we are kind of an oddball 🙂 in a good way). We also have an excellent article written by my colleague, Margo, on the different types of database architectures.

So when I was learning about SQL and NoSQL my first question was, if SQL is great, why do we need NoSQL? Well, like almost anything in the world, SQL has limitations. Relational databases typically scale better vertically than they do horizontally. What do I mean when I say scaling horizontally and vertically? Well that could in itself be an entire blog, so I will share this great resource summarizing scalability and what it means to scale in both directions. Keep in mind that scalability “is the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth.”

SQL is powerful and can handle a variety of data. It can handle the data loads that we used to see back in the 1970’s, 80’s, and 90’s. As the world became more advanced and new technology emerged, we started to see more and more AND more data. Data sets grew, and they are still growing today especially with the rise of IoT and edge computing. Relational databases couldn’t keep up and it doesn’t help that SQL is, ehehmmm, structured and requires a schema for your data. This is where NoSQL comes in.

Large, usually unstructured data can be anything from tweets, emails, images, and documents, to video and audio files. As for SQL and RDBMS, relational literally means data is related rather than random, and these types of databases are typically tabular so at their core, they are not designed for unstructured data. (However, some relational databases today have built in functionality to handle this problem). Not to mention, the scale! As Kyle stated in his blog, relational databases can scale vertically, but this has limits and usually means migrating to a larger server. This process can be expensive, tedious, and frankly terrifying because so many things can go wrong and in 2020, data is gold, so losing it is kind of a big deal. NoSQL means, ehehmmm, no structure or schemaless, meaning you can just throw your data in, no structure necessary. You may also hear the term dynamic schema, which means the data structure changes dynamically as new data is ingested.

Think about the structure of NoSQL being a bucket you just dump data into and that data can still be sorted and interacted with, while SQL is a book shelf that you carefully plan out how you want to categorize your data.

The other main differences include SQL being table based while NoSQL is key value or document based, both of these relating back to the structured versus unstructured concepts. Think about the structure of NoSQL being a bucket you just dump data into and that data can still be sorted and interacted with, while SQL is a book shelf that you carefully plan out how you want to categorize your data. It’s faster and easier to dump data into the bucket and this bucket can hold a lot more data without the walls of the bookcase, but it’s harder to quickly find what you need. The book case is great because it keeps everything in a neat and tidy system, but what if piece of data could go on two different shelves or NEEDs to be on both shelves?

What is your favorite thing about SQL databases? What about NoSQL? Here is another great source, where they break SQL and NoSQL down like two different towns.

SQL is Here to Stay: NewSQL

Many databases claim to be NewSQL as it seems to be a hot new buzzword to some folks. It is a valid database category in my opinion, but there is confusion around how to group databases under NewSQL. As one of our co-founders at HarperDB wrote in his recent blog about NewSQL:

When we look at what defines a “NewSQL” database we see a few key things:

  • “…a class of relational database management systems that seek to provide the scalability of NoSQL systems… “
  • “….while maintaining the ACID guarantees…”
  • “Many enterprise systems … are too large for conventional relational databases, but … are not practical for NoSQL systems.”
  • “The only options previously available for these organizations were to either purchase more powerful computers”
  • “…or to develop custom middleware…”
  • “Both approaches feature high infrastructure costs and/or development costs. NewSQL systems attempt to reconcile the conflicts.”

When we first started HarperDB, we discussed having the capabilities to perform SQL and NoSQL and we thought people would instantly get how AWESOME that was…but they didn’t. They would respond with, So? This is because NewSQL was still a newer and not well defined term, people believed that databases like MongoDB had both because they provided something SQL-ish. It wasn’t really SQL, but to most folks it meant they could do both. (Tools like MongoDB actually incorporate complex integrations to try to achieve SQL functionality, but HarperDB provides that out-of-the-box in one simple tool). So NewSQL is the true blending of both NoSQL and SQL. Data has become so complex, with massive amounts of it coming in from millions of sources. NewSQL is the answer to our increasingly data focused world.

NewSQL is only going to keep growing and evolving and it’s an excellent choice for many use cases. For one, it’s great for startups and newer companies that haven’t dug their heels into a legacy system with complex architectures that have cost them a lot of money. Individual developers working on side projects might choose NewSQL as well. Another example is use cases where traditional ACID compliant SQLf is needed (as opposed to BASE compliance more commonly related to NoSQL databases), that also have a growing need for high performance and larger scaling capabilities at lower costs than traditional RDBMS systems stacked on top of servers. NoSQL database users might also see a need to implement ACID compliant SQL while maintaining the performance and scale that they have come to love with NoSQL.Even coding schools are looking at using NewSQL products so that they can easily expose students to SQL and NoSQL with one tool.

Wrap it Up

In conclusion, as our world evolves and we become more and more technologically advanced, our data evolves right along with us. SQL is never going away, it was the first type of universally acknowledged language for databases after all. NoSQL and SQL databases are vast, mainly including their structure, scalability, and performance. Each have important roles to play in 2020 and will continue to be critical to running a business or project moving forward. NewSQL is a modern solution to a problem that didn’t really exist in the 1980’s, but now, one could argue that needing the “best of both worlds” is THE biggest problem for many enterprises and startups alike. What do you think about NewSQL?