SQL coding guidelines

Although this guide is only available for versions 6.2 and later it applies to older version for the most parts

This document is meant to describe AGRs SQL coding guidelines and best practices. The contents of this document are changing and evolving as we grow and none of it’s contents are set in stone. If you have any questions or comments on the content feel free to contact us and discuss it with us. Also note that these are guidelines not rules. There are always exceptions where we need to deviate from the best practices found here.

This document is useful for:

  • New employee training
  • Making code more readable
  • Making fewer errors in code

Tooling

SQL Server Management Studio (SSMS)

When using SQL Server there really is only one tool that allows you to do it all, write SQL code & manage your server. This tool is SQL Server Management Studio or for short: SSMS. For older versions of SQL Server there was a corresponding version of SSMS (e.g. SSMS 2012 for SQL Server 2012) but Microsoft has now broken the link and publishes SSMS separately and more frequently. When this document was written the current SSMS version was 18.0. SSMS is a free tool and can be downloaded from the Microsoft homepage:

https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms (If this link does not work anymore, just google it!)

The later versions of SSMS are backwards compatible so when working on older customer setups the latest version can be used or the corresponding SSMS version. If you have never used SSMS there is an endless supply of beginner videos on youtube!

Visual Studio (VS)

Developers who write SQL code do much of their work in Visual Studio since it has all kinds of integrations to SQL Server. All of our database source code is maintained in Visual Studio Database projects.

Redgate tools

As good as SSMS is it lacks in some areas for more advanced users and those who write SQL all day. Redgate software offers a wide range of SQL tools, some of which AGR has decided to use:

  • SQL Search: The ultimate search engine for SQL Server. You can search for tables, procedures, triggers, functions and more and even search for code/names within objects. The only thing it does NOT search is the data itself. This tool is free and integrates into SSMS
  • SQL Prompt: Helps you write code faster and better. It has superior code completion, error checking, snippets, SQL formatting and more. Everyone that does any amount of SQL coding should be using this tool. It requires a license and integrates into both SSMS and Visual Studio
  • SQL Compare: Compares databases and allows you to generate upgrade scripts or automatically upgrade databases (partially or completely). This is a stand-alone tool that requires a license
  • SQL Data Compare: Pretty much the same as SQL Compare except it compares data instead of structure

If you wish to use some of the Redgate tools that require a license contact the dev team for more info.

inst

Tabs, spaces and indentation

Tab in source code is from the devil himself and everyone should set all editors to insert spaces instead of tab. We use 4 for indentation.

For SQL Server Management Studio and Visual Studio you select Tools-Options and under Text Editor you change the settings. Other editors usually have something similar. inst

Writing SQL

AGRs system has heaps of SQL code. Core code is mostly written by developers, but the consultants also customize the system making changes to database code. There is a lot of business logic in the SQL code. Standardized formatting and some ground rules make sure the code is more readable and less prone to errors

Case sensitivity

Most SQL server setups do not use a case sensitive collation but every now and then we encounter a case sensitive setup and for that reason we must make sure that our code is written in a manner that is considered “case sensitive safe”. This means that if you create a table called sale_table (all lowercase) you must always refer to it in code with the same case. You can identify if the database is case sensitive by checking the collation, if it has CI in the name that means it is case-insensitive or if it has CS in the name that means it is case sensitive

https://en.wikipedia.org/wiki/Case_sensitivity inst

Commenting your code

Writing good and clear code is not enough. You MUST PUT COMMENTS in your code. More is better. Below are some ground rules:

Rule #1: Header and change history

All procedures, functions and views should have a comment header which includes the following:

  • Author
  • Create date
  • Short description
  • Change history: Date – initials – Jira ref - change description If possible, include a Jira issue reference. Example of a good header: inst

Rule #2: Inline comments

If your code is anything other than a basic select statement you will probably need some inline comments describing the logic and the overall flow. Like in the header, it’s a good idea to include a date and the Jira issue number even if it is already in the header because the header will not tell you exactly which parts of the code were altered.

Rule #3: Old code in comments

Some people leave old code in a comment block in case they need to revert. This is ok but can become messy if there are multiple versions of old code in comment blocks so tidy up your code occasionally. An alternative would be to make a copy of the procedure/view and adding a _backup + _date suffix. Example: inst

Formatting your code

Everyone who has written SQL code has an opinion on how the code should be formatted. Some rules are universally accepted, some are just my personal opinion.

Rule #1: Keyword uppercasing

All keywords should be upper-cased. Example:

SELECT
    mt.myfield
FROM
    mytable mt
    INNER JOIN yourtable yt ON mt.id = yt.id
GROUP BY
    mt.myfield
HAVING
    SUM(mt.amount) > 0

Rule #2: Indentation

Use of proper indentation makes the code much more readable. There are many versions of indentation. Some offer better readability but take up more space, some do the opposite. Instead of describing the exact format, let’s show some examples:

SELECT
    id.item_id,
    COUNT(*)
FROM
    dbo.item i
    INNER JOIN dbo.item_detail id ON id.item_id = i.item_id
    LEFT JOIN dbo.item_extra_info iei ON iei.item_id = id.item_id
WHERE
    i.item_id BETWEEN '0a' AND '0b'
    AND i.category = 'MyCategory'
GROUP BY
    i.item_id
HAVING
    SUM(i.price) > 0

CREATE PROCEDURE dbo.dummy
(
    @mod  INT,
    @high INT
)
AS
BEGIN
    SET NOCOUNT ON

    DECLARE @i INT = 0
    WHILE @i < @high
    BEGIN
        IF @i % @mod = 0
            PRINT @i
        SET @i = @i + 1
    END
END

Rule #3: Aliasing

Every query that joins 2 or more tables should use proper aliasing. All columns in table should be referred with this alias. Let’s imagine we have 2 tables, item and item_details:

CREATE TABLE dbo.item
(
    item_id     INT PRIMARY KEY,
    description NVARCHAR(100),
    price       DECIMAL(18, 4)
)

CREATE TABLE dbo.item_details
(
    item_id    INT PRIMARY KEY,
    extra_info NVARCHAR(100)
)

Now we build a query

SELECT
    i.item_id,
    extra_info
FROM
    dbo.item i
    INNER JOIN dbo.item_detail id ON i.item_id = id.item_id
WHERE
    price > 100.00

This query will run fine, so we put it in production. Now someone decides it’s a good idea to add a price column in item_details with more accuracy:

ALTER TABLE dbo.item_details ADD price DECIMAL(20, 8)

This breaks the query with the following error:

Msg 209, Level 16, State 1, Line 1 - Ambiguous column name ‘price’.

The original query should have been like this:

SELECT
    i.item_id,
    id.extra_info
FROM
    dbo.item i
    INNER JOIN dbo.item_detail id ON i.item_id = id.item_id
WHERE
    i.price > 100.00

Rule #3: Schema qualification

Make sure you always use full names with schemas in your queries.

DON’T DO THIS:

SELECT
    i.item_id,
    id.extra_info
FROM
    item i
    INNER JOIN item_detail id ON id.item_id = i.item_id

DO IT LIKE THIS:

SELECT
    i.item_id,
    id.extra_info
FROM
    dbo.item i
    INNER JOIN dbo.item_detail id ON id.item_id = i.item_id

There are a lot of reasons for using schema names in queries. For example you could have another item table in another schema, e.g. inv.item. If the user logged in has default schema = inv then he would be querying inv.item when querying without schema instead of the intended dbo.item.

More info here: https://www.red-gate.com/simple-talk/blogs/why-you-should-always-use-schema-name-in-queries/

Rule #4: Reserved keywords

Try to avoid using reserved keywords as table or column names. Reserved keywords CAN be used but you will need to use square brackets in your queries if you do so which is a hassle, so let’s just try to avoid this. Example: Let’s create a table that breaks all the rules:

CREATE TABLE [from]
(
    [select] NVARCHAR(20),
    [*]      NVARCHAR(20),
    [from]   NVARCHAR(20),
    [where]  NVARCHAR(20)
)

Now we can make an “excellent” query:

SELECT
    [select]
FROM
    [from]
WHERE
    [where] = '?'

Rule #5: A few more DON’Ts

  • Never NEVER use SELECT * in production code. This can cause all sorts of problems, just google it!

  • Don’t insert into tables without a using a column list.

    DON’T DO THIS:

    INSERT INTO dbo.item
    VALUES (123,'Item 123',19.99)

    DO THIS:

    INSERT INTO dbo.item (item_id, description, price)
    VALUES (123, 'Item 123', 19.99)

  • In SELECT statements, don’t select columns you don’t need. Selecting all columns from a table and then only really using one does not pose any threat, but it can slow the query down and prevent usage of relevant indexes.

  • Try to avoid cross-database reference by actual database name. Example:

USE agr5_stg
GO
CREATE VIEW rep.item_test AS
SELECT
    i1.id,
    i1.name AS stg_name,
    i2.name AS prod_name
FROM
    agr5_stg.rep.items i1
    INNER JOIN agr5_prod.dbo.items i2 ON i2.id = i1.id
WHERE
    i1.primary_vendor_id <> i2.primary_vendor_id

This query has 2 problems. 1) This view exists in the agr5_stg database but yet the first items table is referenced by it’s full 3 part name (database.schema.objectname): agr5_stg.rep.items. What will happen if you make a copy of this database, e.g. agr5_stg_test, is that this view will still be referencing the original database. Also if you rename the database this view will stop working. 2) Joining outside the database. This will fail in the same way as above if you rename or make a copy Problem number 1 should be fixed by using 2 part naming (schema.objectname). Problem number 2 can be fixed by using synonyms or dynamic SQL. Synonyms will also fail when renaming or making copy but at least it’s easy to script out all synonyms when doing so and changing them all at once. Note that you might run into a problem when replacing a table with synonym in that TRUNCATE can’t be used on synonyms. In that case you can use DELETE if the table is small.

Shortcuts

Work smart, not hard and please don’t be a point and click person. You can work so much faster and more efficiently if you use your keyboard properly. SSMS has tons of useful shortcuts. The list is way too long to list in this document but here is a shortcut to a complete list:

https://docs.microsoft.com/en-us/sql/ssms/sql-server-management-studio-keyboard-shortcuts

You can also customize your shortcuts and add new ones. Take a few minutes every now and then to explore new shortcuts. There are also tons of articles and YouTube videos available, for example:

https://www.sqlshack.com/10-ssms-tips-and-tricks-to-boost-your-productivity/

https://www.youtube.com/watch?v=C4QhymuG0yk

Here is a short list to get you started:

Shortcut Action
ALT + up/down arrow Move current line up/down
SHIFT + DEL Delete current line
SHIFT + ALT + arrows Block select text
CTRL + left/right arrow Jump cursor to previous/next word in line
CTRL + up/down arrow Scroll up/down without moving cursor
F5 Execute query in current window or if text is highlighted it will execute that part only
F7 Open object explorer
CTRL + N New query window
ALT + BREAK Cancel running query
CTRL + K + C Comment out selected lines
CTRL + K + U Uncomment selected lines
CTRL + SHIFT + U Uppercase selected text
CTRL + SHIFT + V Paste ring buffer – hold CTRL + SHIFT, press V a few times and see what happens!

Object naming

Abbreviations

Don’t abbreviate to the point that the names become hard to understand just to save a few characters. People used to abbreviate much more in the “old days” because of size and space limitations and to be able to write code faster. With today’s intellisense and auto-complete we just don’t need to anymore. Also keep in mind that company “lingo” that seems logical for someone that has worked at AGR for years might be difficult to understand for the newbies.

BAD GOOD
itm_ldgr item_ledger
itm_fct Item_forecast
ppl_orders populate_orders

Camel casing, snake casing or space?

When we name an object, we have a few options when it comes to names which consist of multiple words. For those not familiar with camel casing I suggest reading up on it here:

https://en.wikipedia.org/wiki/Camel_case

https://en.wikipedia.org/wiki/Snake_case

Type Example
Camel case itemLedgerEntry
Snake case (underscore) item_ledger_entry
Space item ledger entry

All of the above naming conventions are valid and each having its pros and cons. To maintain consistency, we need to choose one and since it seems that most of AGRs SQL code uses snake case we will keep using that.

Singular or plural

inv.item or inv.items ?

rep.item_details, rep.items_detail, rep.items_details or rep.items_detail ?

It’s so confusing…

There are many arguments for using singular and many for plural, but the most accepted way is using singlular so let’s do that. Much of the older code uses plural but we are gradually moving towards singular.

rep.item it is!

Ordering and grouping

SSMS lists the objects in alphabetical order. Preferably we would like objects to be logically grouped together so we can find them more easily. Let’s create a hypothetical example: We have a few special_calc tables that include some special calculations. We need to do those calculations for items, stocks and histories_sale. If we give these tables a suffix like so:

  • rep.histories_sale_special_calc
  • rep.items_special_calc
  • rep.stocks_special_calc

The tables are scattered: inst

But if we make the “special_calc” a prefix instead of suffix the tables will be logically grouped: inst

This may not be a perfect example because when adding a suffix, the new tables are logically grouped with their parent tables (e.g. items -> items_special_calc) which is also a good grouping so as you can see there is no correct solution here. It all depends!

Customization

Almost all AGR setups have some customization. When doing customizations, try to use the cus schema or put a cus_ prefix on your custom objects. If modifying core procedures or views add a comment above your custom section containing the words: “Custom” or “Customization” along with a short description and preferably a reference to a document describing the functionality or a Jira issue number.

Object naming conventions

Below is a list describing naming rules

Object type Prefix Rule / example
Index IX_ IX_schema_table_fields / IX_rep_item_name
Unique index UQ_ UQ_schema_tablename_columnname1_columnname2… / UQ_rep_item_item_no
Check constraint CK_ CK_schema_tablename_columnname / CK_rep_item_closed
Default constraint DF_ DF_schema_tablename_columnname / DF_rep_item_active
Primary key PK_ PK_schema_tablename / PK_rep_item
Foreign key FK_ FK_schema_tablename_dstschema_dsttablename_columnname1_columnname2… / FK_rep_item_rep_itemgroup_item_group_id
Stored procedure NOT_SP No rule except don’t give procedures a sp_ prefix (sp_ prefix makes SQL server think this might be a system procedure which adds overhead)
Table function FNT_ fnt_function_name
Scalar function FN_ fn_function_name
Table No rule
View v_ All views should start with V_ / v_user_access
Trigger tr_ tr_schema_tablename_extrainfo* / tr_rep_item_send_new_items_to_broker_queue

* extrainfo can be something describing the functionality of the trigger

CTE naming

When writing complex CTEs with multiple queries consider using a cte_ prefix. Example:

;WITH cte_item AS
(
    SELECT
        i.item_id,
        i.description
    FROM
        dbo.item i
),
cte_item_details AS
(
    SELECT
        i.item_id,
        id.extra_info
    FROM
        cte_item i
        INNER JOIN dbo.item_details id ON i.item_id = id.item_id
)
SELECT
    *
FROM
    cte_item_details cid
    INNER JOIN dbo.item i ON i.item_id = cid.item_id

This query of course makes no sense at all except for proving a point. If I had named the first cte “item” instead of “cte_item” I might confuse the cte with the actual table name dbo.item in my later cte.

Data types

Joining on different data types can cause implicit conversion which can kill your performance. Using too big of a type can slow down queries. Make sure you use the correct data types, there are not that many datatypes in sql server. Below is a list of a few good rules when it comes to picking the right types:

Rule #1: Always use NVARCHAR or NCHAR

An architectural decision has been made to use NVARCHAR as the primary datatype for all text in AGR. Most text fields were already using NVARCHAR (or NCHAR) but as of version 6.2 all existing code has been converted to NVARCHAR and all new code will use NVARCHAR.

Rule #2: Use DATETIME2

As of version 6.2 we only use DATETIME2 (usually DATETIME2(0) is enough) instead of DATETIME. The reason is that the new datetime type offers greater range for less storage space and is recommended by Microsoft When populating DATETIME2, make sure you use SYSDATETIME() instead of GETDATE() because SYSDATETIME() returns DATETIME2 with proper precision

More information here: https://database.guide/datetime-vs-datetime2-in-sql-server-whats-the-difference/

Rule #3: Don’t use text types (char/nchar/varchar/nvarchar) for numbers or dates

This one is simple and seems obvious but seems to be broken quite a lot. If you have a column that contains only numbers or dates, don’t use text data types. There is no check to see if the numbers or dates are valid.

Rule #4: Use the appropriate type and watch the size

Make sure you are not using too big of a datatype. For example:

  • If you are storing a boolean flag use BIT, not INT
  • If you are storing simple status codes, for example number ranging between 1 and 10, use TINYINT (range 0-255).
  • Watch out for NVARCHAR size. If you create a “name” column, it can matter if you make it NVARHCAR(50) or NVARCHAR(4000) even if you only store short texts. See more here: https://sqlperformance.com/2017/06/sql-plan/performance-myths-oversizing-strings

No matter what they say… Size DOES matter! (when we have millions and millions of rows)

Rule #5: Use DATE (not datetime) when storing a DATE

This one also might seem obvious but often we see people using the DATETIME type for storing dates. Besides taking up more space this can cause problems when selecting. Example:

We have an items table which contains a “DATE” field called last_sale_date but it’s really a DATETIME field. Let’s do an update:

UPDATE dbo.item SET last_sale_date = SYSDATETIME() where item_id BETWEEN 1 and 100

Now we wish to find all the items that had last sale date = 2019-06-01 Can you see why this would fail?

SELECT * FROM dbo.item WHERE last_sale_date = '2019-06-01'
inst

Rule #6: Numeric precision – beware of float and real

The float and real datatypes are approximate numerics and should only be used if the precision provided by decimal (up to 38 digits) is insufficient. Because of the approximate nature of the float and real data types, do not use these data types when exact numeric behavior is required, such as in financial applications, in operations involving rounding, or in equality checks. Instead, use the integer, decimal, money, or smallmoney data types.

Try this code:

DECLARE @value FLOAT(18) = 0.0
WHILE @value < 10.0
BEGIN
    PRINT @value
    SET @value += 0.1
END

Look at the output: inst

Security

Last but not least is security! Only grant users the permissions they need, don’t just give users sysadmin rights because you are too lazy to configure this properly. In many cases we are sharing a server with other systems and we most definitely do not want those systems to be compromised if our system is.

Common pitfalls

Below is a list of common pitfalls. Any SQL developer who claims he has never fallen in any of those is not less than honest.

NULLs

Those precious NULLs, a seemingly endless source of joy:

https://www.red-gate.com/simple-talk/sql/t-sql-programming/how-to-get-nulls-horribly-wrong-in-sql-server/

Over-indexing

Indexes are great and they can speed up queries, but they also take up space and slow down inserts. Each time you insert or update a row, all the relevant indexes must be updated. I have seen tables with 5GB of data and 50GB of indexes. That might be categorized as “Overkill”.

Other pitfalls

Redgate has all the answers, read this article:

https://www.red-gate.com/simple-talk/sql/t-sql-programming/ten-common-sql-programming-mistakes/