Although this guide is only available for versions 6.2 and later it applies to older version for the most parts
This document is meant to describe AGRs SQL coding guidelines and best practices. The contents of this document are changing and evolving as we grow and none of it’s contents are set in stone. If you have any questions or comments on the content feel free to contact us and discuss it with us. Also note that these are guidelines not rules. There are always exceptions where we need to deviate from the best practices found here.
This document is useful for:
When using SQL Server there really is only one tool that allows you to do it all, write SQL code & manage your server. This tool is SQL Server Management Studio or for short: SSMS. For older versions of SQL Server there was a corresponding version of SSMS (e.g. SSMS 2012 for SQL Server 2012) but Microsoft has now broken the link and publishes SSMS separately and more frequently. When this document was written the current SSMS version was 18.0. SSMS is a free tool and can be downloaded from the Microsoft homepage:
https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms (If this link does not work anymore, just google it!)
The later versions of SSMS are backwards compatible so when working on older customer setups the latest version can be used or the corresponding SSMS version. If you have never used SSMS there is an endless supply of beginner videos on youtube!
Developers who write SQL code do much of their work in Visual Studio since it has all kinds of integrations to SQL Server. All of our database source code is maintained in Visual Studio Database projects.
As good as SSMS is it lacks in some areas for more advanced users and those who write SQL all day. Redgate software offers a wide range of SQL tools, some of which AGR has decided to use:
If you wish to use some of the Redgate tools that require a license contact the dev team for more info.
Tab in source code is from the devil himself and everyone should set all editors to insert spaces instead of tab. We use 4 for indentation.
For SQL Server Management Studio and Visual Studio you select Tools-Options and under Text Editor you change the settings. Other editors usually have something similar.
AGRs system has heaps of SQL code. Core code is mostly written by developers, but the consultants also customize the system making changes to database code. There is a lot of business logic in the SQL code. Standardized formatting and some ground rules make sure the code is more readable and less prone to errors
Most SQL server setups do not use a case sensitive collation but every now and then we encounter a case sensitive setup and for that reason we must make sure that our code is written in a manner that is considered “case sensitive safe”. This means that if you create a table called sale_table
(all lowercase) you must always refer to it in code with the same case.
You can identify if the database is case sensitive by checking the collation, if it has CI in the name that means it is case-insensitive or if it has CS in the name that means it is case sensitive
https://en.wikipedia.org/wiki/Case_sensitivity
Writing good and clear code is not enough. You MUST PUT COMMENTS in your code. More is better. Below are some ground rules:
All procedures, functions and views should have a comment header which includes the following:
If your code is anything other than a basic select statement you will probably need some inline comments describing the logic and the overall flow. Like in the header, it’s a good idea to include a date and the Jira issue number even if it is already in the header because the header will not tell you exactly which parts of the code were altered.
Some people leave old code in a comment block in case they need to revert. This is ok but can become messy if there are multiple versions of old code in comment blocks so tidy up your code occasionally. An alternative would be to make a copy of the procedure/view and adding a _backup + _date suffix.
Example:
Everyone who has written SQL code has an opinion on how the code should be formatted. Some rules are universally accepted, some are just my personal opinion.
All keywords should be upper-cased. Example:
SELECT
mt.myfield
FROM
mytable mt
INNER JOIN yourtable yt ON mt.id = yt.id
GROUP BY
mt.myfield
HAVING
SUM(mt.amount) > 0
Use of proper indentation makes the code much more readable. There are many versions of indentation. Some offer better readability but take up more space, some do the opposite. Instead of describing the exact format, let’s show some examples:
SELECT
id.item_id,
COUNT(*)
FROM
dbo.item i
INNER JOIN dbo.item_detail id ON id.item_id = i.item_id
LEFT JOIN dbo.item_extra_info iei ON iei.item_id = id.item_id
WHERE
i.item_id BETWEEN '0a' AND '0b'
AND i.category = 'MyCategory'
GROUP BY
i.item_id
HAVING
SUM(i.price) > 0
CREATE PROCEDURE dbo.dummy
(
@mod INT,
@high INT
)
AS
BEGIN
SET NOCOUNT ON
DECLARE @i INT = 0
WHILE @i < @high
BEGIN
IF @i % @mod = 0
PRINT @i
SET @i = @i + 1
END
END
Every query that joins 2 or more tables should use proper aliasing. All columns in table should be referred with this alias. Let’s imagine we have 2 tables, item and item_details:
CREATE TABLE dbo.item
(
item_id INT PRIMARY KEY,
description NVARCHAR(100),
price DECIMAL(18, 4)
)
CREATE TABLE dbo.item_details
(
item_id INT PRIMARY KEY,
extra_info NVARCHAR(100)
)
Now we build a query
SELECT
i.item_id,
extra_info
FROM
dbo.item i
INNER JOIN dbo.item_detail id ON i.item_id = id.item_id
WHERE
price > 100.00
This query will run fine, so we put it in production. Now someone decides it’s a good idea to add a price
column in item_details
with more accuracy:
ALTER TABLE dbo.item_details ADD price DECIMAL(20, 8)
This breaks the query with the following error:
Msg 209, Level 16, State 1, Line 1 - Ambiguous column name ‘price’.
The original query should have been like this:
SELECT
i.item_id,
id.extra_info
FROM
dbo.item i
INNER JOIN dbo.item_detail id ON i.item_id = id.item_id
WHERE
i.price > 100.00
Make sure you always use full names with schemas in your queries.
DON’T DO THIS:
SELECT
i.item_id,
id.extra_info
FROM
item i
INNER JOIN item_detail id ON id.item_id = i.item_id
DO IT LIKE THIS:
SELECT
i.item_id,
id.extra_info
FROM
dbo.item i
INNER JOIN dbo.item_detail id ON id.item_id = i.item_id
There are a lot of reasons for using schema names in queries. For example you could have another item table in another schema, e.g. inv.item
. If the user logged in has default schema = inv then he would be querying inv.item
when querying without schema instead of the intended dbo.item
.
More info here: https://www.red-gate.com/simple-talk/blogs/why-you-should-always-use-schema-name-in-queries/
Try to avoid using reserved keywords as table or column names. Reserved keywords CAN be used but you will need to use square brackets in your queries if you do so which is a hassle, so let’s just try to avoid this. Example: Let’s create a table that breaks all the rules:
CREATE TABLE [from]
(
[select] NVARCHAR(20),
[*] NVARCHAR(20),
[from] NVARCHAR(20),
[where] NVARCHAR(20)
)
Now we can make an “excellent” query:
SELECT
[select]
FROM
[from]
WHERE
[where] = '?'
Never NEVER use SELECT * in production code. This can cause all sorts of problems, just google it!
Don’t insert into tables without a using a column list.
DON’T DO THIS:
INSERT INTO dbo.item
VALUES (123,'Item 123',19.99)
DO THIS:
INSERT INTO dbo.item (item_id, description, price)
VALUES (123, 'Item 123', 19.99)
In SELECT statements, don’t select columns you don’t need. Selecting all columns from a table and then only really using one does not pose any threat, but it can slow the query down and prevent usage of relevant indexes.
Try to avoid cross-database reference by actual database name. Example:
USE agr5_stg
GO
CREATE VIEW rep.item_test AS
SELECT
i1.id,
i1.name AS stg_name,
i2.name AS prod_name
FROM
agr5_stg.rep.items i1
INNER JOIN agr5_prod.dbo.items i2 ON i2.id = i1.id
WHERE
i1.primary_vendor_id <> i2.primary_vendor_id
This query has 2 problems. 1) This view exists in the agr5_stg database but yet the first items table is referenced by it’s full 3 part name (database.schema.objectname): agr5_stg.rep.items. What will happen if you make a copy of this database, e.g. agr5_stg_test, is that this view will still be referencing the original database. Also if you rename the database this view will stop working. 2) Joining outside the database. This will fail in the same way as above if you rename or make a copy Problem number 1 should be fixed by using 2 part naming (schema.objectname). Problem number 2 can be fixed by using synonyms or dynamic SQL. Synonyms will also fail when renaming or making copy but at least it’s easy to script out all synonyms when doing so and changing them all at once. Note that you might run into a problem when replacing a table with synonym in that TRUNCATE can’t be used on synonyms. In that case you can use DELETE if the table is small.
Work smart, not hard and please don’t be a point and click person. You can work so much faster and more efficiently if you use your keyboard properly. SSMS has tons of useful shortcuts. The list is way too long to list in this document but here is a shortcut to a complete list:
https://docs.microsoft.com/en-us/sql/ssms/sql-server-management-studio-keyboard-shortcuts
You can also customize your shortcuts and add new ones. Take a few minutes every now and then to explore new shortcuts. There are also tons of articles and YouTube videos available, for example:
https://www.sqlshack.com/10-ssms-tips-and-tricks-to-boost-your-productivity/
https://www.youtube.com/watch?v=C4QhymuG0yk
Here is a short list to get you started:
Shortcut | Action |
---|---|
ALT + up/down arrow | Move current line up/down |
SHIFT + DEL | Delete current line |
SHIFT + ALT + arrows | Block select text |
CTRL + left/right arrow | Jump cursor to previous/next word in line |
CTRL + up/down arrow | Scroll up/down without moving cursor |
F5 | Execute query in current window or if text is highlighted it will execute that part only |
F7 | Open object explorer |
CTRL + N | New query window |
ALT + BREAK | Cancel running query |
CTRL + K + C | Comment out selected lines |
CTRL + K + U | Uncomment selected lines |
CTRL + SHIFT + U | Uppercase selected text |
CTRL + SHIFT + V | Paste ring buffer – hold CTRL + SHIFT, press V a few times and see what happens! |
Don’t abbreviate to the point that the names become hard to understand just to save a few characters. People used to abbreviate much more in the “old days” because of size and space limitations and to be able to write code faster. With today’s intellisense and auto-complete we just don’t need to anymore. Also keep in mind that company “lingo” that seems logical for someone that has worked at AGR for years might be difficult to understand for the newbies.
BAD | GOOD |
---|---|
itm_ldgr | item_ledger |
itm_fct | Item_forecast |
ppl_orders | populate_orders |
When we name an object, we have a few options when it comes to names which consist of multiple words. For those not familiar with camel casing I suggest reading up on it here:
https://en.wikipedia.org/wiki/Camel_case
https://en.wikipedia.org/wiki/Snake_case
Type | Example |
---|---|
Camel case | itemLedgerEntry |
Snake case (underscore) | item_ledger_entry |
Space | item ledger entry |
All of the above naming conventions are valid and each having its pros and cons. To maintain consistency, we need to choose one and since it seems that most of AGRs SQL code uses snake case we will keep using that.
inv.item
or inv.items
?
rep.item_details
, rep.items_detail
, rep.items_details
or rep.items_detail
?
It’s so confusing…
There are many arguments for using singular and many for plural, but the most accepted way is using singlular so let’s do that. Much of the older code uses plural but we are gradually moving towards singular.
rep.item
it is!
SSMS lists the objects in alphabetical order. Preferably we would like objects to be logically grouped together so we can find them more easily. Let’s create a hypothetical example: We have a few special_calc tables that include some special calculations. We need to do those calculations for items, stocks and histories_sale. If we give these tables a suffix like so:
rep.histories_sale_special_calc
rep.items_special_calc
rep.stocks_special_calc
The tables are scattered:
But if we make the “special_calc” a prefix instead of suffix the tables will be logically grouped:
This may not be a perfect example because when adding a suffix, the new tables are logically grouped with their parent tables (e.g. items -> items_special_calc) which is also a good grouping so as you can see there is no correct solution here. It all depends!
Almost all AGR setups have some customization. When doing customizations, try to use the cus schema or put a cus_ prefix on your custom objects. If modifying core procedures or views add a comment above your custom section containing the words: “Custom” or “Customization” along with a short description and preferably a reference to a document describing the functionality or a Jira issue number.
Below is a list describing naming rules
Object type | Prefix | Rule / example |
---|---|---|
Index | IX_ | IX_schema_table_fields / IX_rep_item_name |
Unique index | UQ_ | UQ_schema_tablename_columnname1_columnname2… / UQ_rep_item_item_no |
Check constraint | CK_ | CK_schema_tablename_columnname / CK_rep_item_closed |
Default constraint | DF_ | DF_schema_tablename_columnname / DF_rep_item_active |
Primary key | PK_ | PK_schema_tablename / PK_rep_item |
Foreign key | FK_ | FK_schema_tablename_dstschema_dsttablename_columnname1_columnname2… / FK_rep_item_rep_itemgroup_item_group_id |
Stored procedure | NOT_SP | No rule except don’t give procedures a sp_ prefix (sp_ prefix makes SQL server think this might be a system procedure which adds overhead) |
Table function | FNT_ | fnt_function_name |
Scalar function | FN_ | fn_function_name |
Table | No rule | |
View | v_ | All views should start with V_ / v_user_access |
Trigger | tr_ | tr_schema_tablename_extrainfo* / tr_rep_item_send_new_items_to_broker_queue |
* extrainfo can be something describing the functionality of the trigger
When writing complex CTEs with multiple queries consider using a cte_ prefix. Example:
;WITH cte_item AS
(
SELECT
i.item_id,
i.description
FROM
dbo.item i
),
cte_item_details AS
(
SELECT
i.item_id,
id.extra_info
FROM
cte_item i
INNER JOIN dbo.item_details id ON i.item_id = id.item_id
)
SELECT
*
FROM
cte_item_details cid
INNER JOIN dbo.item i ON i.item_id = cid.item_id
This query of course makes no sense at all except for proving a point. If I had named the first cte “item” instead of “cte_item” I might confuse the cte with the actual table name dbo.item in my later cte.
Joining on different data types can cause implicit conversion which can kill your performance. Using too big of a type can slow down queries. Make sure you use the correct data types, there are not that many datatypes in sql server. Below is a list of a few good rules when it comes to picking the right types:
An architectural decision has been made to use NVARCHAR as the primary datatype for all text in AGR. Most text fields were already using NVARCHAR (or NCHAR) but as of version 6.2 all existing code has been converted to NVARCHAR and all new code will use NVARCHAR.
As of version 6.2 we only use DATETIME2 (usually DATETIME2(0) is enough) instead of DATETIME. The reason is that the new datetime type offers greater range for less storage space and is recommended by Microsoft When populating DATETIME2, make sure you use SYSDATETIME() instead of GETDATE() because SYSDATETIME() returns DATETIME2 with proper precision
More information here: https://database.guide/datetime-vs-datetime2-in-sql-server-whats-the-difference/
This one is simple and seems obvious but seems to be broken quite a lot. If you have a column that contains only numbers or dates, don’t use text data types. There is no check to see if the numbers or dates are valid.
Make sure you are not using too big of a datatype. For example:
No matter what they say… Size DOES matter! (when we have millions and millions of rows)
This one also might seem obvious but often we see people using the DATETIME type for storing dates. Besides taking up more space this can cause problems when selecting. Example:
We have an items table which contains a “DATE” field called last_sale_date
but it’s really a DATETIME field. Let’s do an update:
UPDATE dbo.item SET last_sale_date = SYSDATETIME() where item_id BETWEEN 1 and 100
Now we wish to find all the items that had last sale date = 2019-06-01 Can you see why this would fail?
SELECT * FROM dbo.item WHERE last_sale_date = '2019-06-01'
The float and real datatypes are approximate numerics and should only be used if the precision provided by decimal (up to 38 digits) is insufficient. Because of the approximate nature of the float and real data types, do not use these data types when exact numeric behavior is required, such as in financial applications, in operations involving rounding, or in equality checks. Instead, use the integer, decimal, money, or smallmoney data types.
Try this code:
DECLARE @value FLOAT(18) = 0.0
WHILE @value < 10.0
BEGIN
PRINT @value
SET @value += 0.1
END
Look at the output:
Last but not least is security! Only grant users the permissions they need, don’t just give users sysadmin rights because you are too lazy to configure this properly. In many cases we are sharing a server with other systems and we most definitely do not want those systems to be compromised if our system is.
Below is a list of common pitfalls. Any SQL developer who claims he has never fallen in any of those is not less than honest.
Those precious NULLs, a seemingly endless source of joy:
Indexes are great and they can speed up queries, but they also take up space and slow down inserts. Each time you insert or update a row, all the relevant indexes must be updated. I have seen tables with 5GB of data and 50GB of indexes. That might be categorized as “Overkill”.
Redgate has all the answers, read this article:
https://www.red-gate.com/simple-talk/sql/t-sql-programming/ten-common-sql-programming-mistakes/