Everyone should speak SQL!

Do you know that you use databases everyday? Actually databases are everywhere. When you log  in, your credentials are validated against a database. When you are querying google or bing, a large amount of data get filtered to provide you adequate results. Even your phone contact list is a database too. Etc.

More interestingly, we have entered in the era of data. For example, “Wal-Mart, a retail giant, handles more than 1m customer transactions every hour, feeding databases estimated at more than 2.5 petabytes—the equivalent of 167 times the books in America’s Library of Congress”.

Data

big-data_conew1 by luckey_sun (some rights reserved)

This accumulation of diverse data makes it possible to do many interesting things: spot behavioral or business trends, prevent diseases, extract unknown knowledge, detect fraud and prevent crime and so on.

Such advanced data analytics and data science is out of the scope of this article. First, I will present you what is SQL and what it looks like. Secondly, I will assess the advantages of learning SQL and tell you what is the expected level you should reach.

What is SQL?

SQL (pronounce it “sequel”) stands for Structured Query Language and it is the primary language to interact with Relational Database Management Systems (RDBMS) which are based on the relational model as invented by E. F. Codd.

RDBMS are widely used in the world of databases management systems. There are several alternatives like hierarchical model, object-oriented databases or graph databases. So there are several query languages like SPARQL for RDF graphs or MDX for multi-dimensional analytical databases. Each fits for a specific purpose.

In a relational model, all data are represented in terms of tuples, grouped into relations. More explicitly, data are stored in tables with related tables.

Relational data

Let’s talk of SQL. SQL is a language with a high level of abstraction, similar to natural language, then accessible to everyone.

4 sub-languages

SQL is divided in several sub-languages:

  • Data Definition Language (DDL) statements that are used to define the database structure or schema.
  • Data Manipulation Language (DML) statements that are used for managing data within schema objects.
  • Data Control Language (DCL) statements
  • Transaction Control (TCL) statements that are used to manage the changes made by DML statements.

Not all are necessary for everyone. While some are mainly used by well-defined professional (DDL by database architects, DCL by database administrators and TCL by programmers), DML should be known by everyone who uses databases.

The most known DML statements are the SELECT, INSERT, UPDATE and DELETE (a few people know the other useful statements like the MERGE which is an upsert operation, e.g. INSERT or UPDATE).

As you may see the commands are quite intuitive (SELECT for section for example).

Required level at performing DML queries

SQL queries are quite intuitives. Here is an example:

SQL statement anatomy

Sql statement anatomy by SqlPac. CC-BY-SA-3.0-MIGRATED

As you can imagine, the language allows complex queries with aggregate and analytic functions, recursion, even pattern matching in the Oracle dialect. Since it is a high level language, you have to be very careful with such advanced functionalities to get reliable results.

To understand the power of this language, take a look at the following image. It is the standard way to join data from several table into adequate sets.

SQL join

The different types of SQL join and their outputs as determined by sets, by arbek

Basic level

For basics purpose, you have to know the basics of a relational databases, being able to perform simple SELECT, INSERT and DELETE statements with the restricting WHERE clause and the ordering ORDER BY clause. You should also learn how to deal with dates and nulls, how to use the GROUP BY statement and built-in single row functions.

Intermediate level

For intermediate level, you should know every concepts which are taught in the Querying Microsoft SQL Server 2012 Databases Jump Start (and how to implement it the dialect of the RDMBS you use).

Advanced level

At an advanced level, you should be able to became Oracle certified expert and pass the Oracle Database SQL Expert 1Z0-047 exam.

The advantages of speaking SQL

No more Excel crashing

The RDBMS are designed to be extensively used and do not crash so easily.

Answer advanced questions by yourself

Once the DBAs allowed you to connect to the appropriate databases, you can answer business questions in an advanced manner (far more powerful than Excel spreadsheets). No more need to rely on pre-built reports!

Reproducibility

SQL queries can be saved and then be checked and re-used.

And last, but not least:

Being well-paid

Since employers seek for this skill, you may earn more money. Actually most of people use SQL in a bad way and being certified is a great advantage.