Being a long-time devotee of all things Microsoft SQL Server, I had quite a bit of learning to do to get up to speed on our new Teradata appliance. This guide attempts to distill some of what I have learned. I will primarily focus on what I have found most lacking: examples of how to convert SQL Server T-SQL to Teradata SQL. This guide will use the vendor-supplied sample databases: SQL Server’s AdventureWorksDW2012, and the Teradata’s Express 14.0 Retail database.
Also, while this is probably self explanatory, the “mufford” object referenced in the Teradata examples is simply my own personal sandbox. You’ll need to replace any “mufford” reference to whatever your development sandbox may be named.
Release History:
- 2012-08-17 – Added UPDATE examples; reformatted
- 2012-08-12 – Added date examples
- 2012-08-08 – Initial Release
SELECT Statements
Simple SELECT
SQL Server:
Best practice is to qualify the table schema; qualifying the database is optional unless you plan to perform cross-database queries.
USE AdventureWorksDW2012; GO SELECT TOP 10 ProductKey , EnglishProductName FROM dbo.DimProduct; -- OR SELECT TOP 10 ProductKey , EnglishProductName FROM AdventureWorksDW2012.dbo.DimProduct; |
Teradata:
Teradata does not support the concept of schemas, so you would select directly from a database and table.
SELECT TOP 10 P_PartKey , P_Name FROM retail.Product; |
UPDATE Statements
Update From a Subquery
SQL Server:
To help demonstrate a more complex UPDATE, I am going to create a temporary table, insert into the temporary table, and then update that same table. Please don’t get caught up in the data so much as the overall syntax.
CREATE TABLE #UpdateTest_firstOrder ( ProductKey INT , FirstOrderNumber NVARCHAR(20) NULL ); INSERT INTO #UpdateTest_firstOrder(ProductKey) SELECT DISTINCT ProductKey FROM dbo.DimProduct; UPDATE ufo SET FirstOrderNumber = COALESCE(x.FirstOrderNumber, '-1') FROM #UpdateTest_firstOrder ufo JOIN ( SELECT ProductKey , MIN(SalesOrderNumber) AS FirstOrderNumber FROM dbo.FactInternetSales GROUP BY ProductKey ) AS x ON ufo.ProductKey = x.ProductKey; |
Teradata:
We’re going to do the same thing in Teradata: create a temporary (volatile) table, populate it, then perform an update. However, there are 2 different ways to write the UPDATE statement. Both are shown here.
CREATE VOLATILE TABLE UpdateTest_firstOrder ( L_PartKey INTEGER , FirstOrderKey INTEGER NULL ) PRIMARY INDEX ( L_PartKey ) ON COMMIT PRESERVE ROWS; INSERT INTO UpdateTest_firstOrder(L_PartKey) SELECT DISTINCT P_PartKey FROM retail.Product; -- Option 1 UPDATE UpdateTest_firstOrder FROM ( SELECT L_PartKey , MIN(L_OrderKey) AS FirstOrderKey FROM retail.Item GROUP BY L_PartKey ) AS x SET FirstOrderKey = COALESCE(x.FirstOrderKey, -1) WHERE UpdateTest_firstOrder.L_PartKey = x.L_PartKey; -- Option 2 UPDATE UpdateTest_firstOrder SET FirstOrderKey = FirstOrderKey WHERE EXISTS( SELECT L_PartKey , MIN(L_OrderKey) AS firstOrderID FROM retail.Item WHERE UpdateTest_firstOrder.L_PartKey = retail.Item.L_PartKey GROUP BY L_PartKey ); |
DELETE Statements
Delete Top X
SQL Server:
This is pretty straight-forward in SQL Server.
DELETE TOP (1) FROM dbo.DimEmployee WHERE BirthDate < '2012-01-01'; |
Teradata:
Because Teradata is intended to be a data warehouse, not an OLTP system, this is a little more involved. We need to first create a temporary table that contains the rows we wish to remove, then perform the DELETE using the IN clause.
CREATE TABLE mufford.ToBeDeleted AS ( SELECT TOP 1 EmpNo FROM retail.Employee WHERE DOB < DATE'2012-01-01' ) WITH DATA; DELETE FROM retail.Employee WHERE EmpNo IN (SELECT EmpNo FROM mufford.ToBeDeleted); |
Delete where a relationship exists
SQL Server:
DELETE fis FROM dbo.FactInternetSales AS fis JOIN dbo.DimSalesTerritory AS dst ON fis.SalesTerritoryKey = dst.SalesTerritoryKey WHERE dst.SalesTerritoryCountry <> 'United States'; |
Teradata:
Again, you can do this by using a subquery.
DELETE FROM retail.Item WHERE L_PartKey IN ( SELECT P_PartKey FROM retail.Product WHERE P_Mfgr = 'Manufacturer#1' ); |
Delete where a relationship does not exist:
SQL Server:
DELETE fis FROM dbo.DimProduct AS dp LEFT JOIN dbo.FactInternetSales AS fis ON dp.ProductKey = fis.ProductKey WHERE fis.SalesOrderNumber IS NULL; |
Teradata:
Once more, we’ll use a subquery to perform our delete.
DELETE FROM retail.Item WHERE NOT EXISTS( SELECT P_PartKey FROM retail.Product WHERE retail.Item.L_PartKey = retail.Product.P_PartKey ); |
Temporary / Volatile Tables
Creating a simple temporary table
SQL Server:
The simple way:
SELECT LastName, FirstName INTO #MyTempTable FROM dbo.DimEmployee WHERE DepartmentName = 'Engineering'; |
The verbose, “best practice” way:
CREATE TABLE #MyTempTable ( LastName NVARCHAR(50) , FirstName NVARCHAR(50) ); INSERT INTO #MyTempTable ( LastName , FirstName ) SELECT LastName , FirstName FROM dbo.DimEmployee WHERE DepartmentName = 'Engineering'; |
Teradata:
Teradata’s equivalent of temporary tables is volatile tables.
CREATE VOLATILE TABLE MyTempTable AS ( SELECT Name FROM retail.Employee WHERE DeptNo = 1420 ) WITH DATA ON COMMIT PRESERVE ROWS; |
Using Dates
Find Within Last X Days
SQL Server:
SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE OrderDate >= DATEADD(day, -30, GETDATE()); |
Teradata:
SELECT L_OrderKey FROM retail.Item WHERE l_shipdate >= CURRENT_DATE - INTERVAL '30' DAY; |
Calculate the Number of Days Between 2 Dates
SQL Server:
SELECT TOP 10 OrderDate , ShipDate , DATEDIFF(day, OrderDate, ShipDate) AS 'daysLapsed' FROM dbo.FactInternetSales; -- OR SELECT TOP 10 OrderDate , ShipDate , CAST(ShipDate - OrderDate AS INT) AS 'daysLapsed' FROM dbo.FactInternetSales; |
Teradata:
SELECT L_ShipDate , L_ReceiptDate , L_ReceiptDate - L_ShipDate AS "daysLapsed" FROM retail.Item; |
Using dates in variables
SQL Server:
CREATE PROCEDURE spReturnOrdersBetweenDaysExample ( @StartDate DATETIME , @EndDate DATETIME ) AS SELECT * FROM dbo.FactInternetSales WHERE OrderDate >= @StartDate AND OrderDate < @EndDate; EXECUTE spReturnOrdersBetweenDaysExample '2005-07-01', '2005-08-01'; |
Teradata:
We’ll use a macro instead of a stored procedure for returning this same type of data in Teradata.
CREATE MACRO mufford.mReturnOrdersBetweenDays ( startDate DATE , endDate DATE ) AS ( SELECT * FROM retail.Item WHERE L_ShipDate >= :startDate AND L_ShipDate < :endDate; ); EXECUTE mufford.mReturnOrdersBetweenDays('1993-09-01','1993-10-01'); |
Note: if you get a “An owner referenced by user does not have SELECT WITH GRANT OPTION” error when you try to execute this, make sure you run this:
GRANT SELECT ON retail.Item TO mufford WITH GRANT OPTION; |
Performing date calculations using variables
SQL Server:
CREATE PROCEDURE spReturnOrdersInXDaysExample ( @StartDate DATETIME , @numberOfDays INT ) AS SELECT * FROM dbo.FactInternetSales WHERE OrderDate >= @StartDate AND OrderDate < DATEADD(day, @numberOfDays, @StartDate); EXECUTE spReturnOrdersInXDaysExample '2005-07-01', 30; |
Teradata:
We’ll use a macro instead of a stored procedure for returning this same type of data in Teradata.
Teradata:
CREATE MACRO mufford.mReturnOrdersInXDaysExample ( startDate DATE DEFAULT CURRENT_DATE , numberOfDays INTEGER DEFAULT 30 ) AS ( SELECT * FROM retail.Item WHERE L_ShipDate >= :startDate AND L_ShipDate < :startDate + CAST(:numberOfDays AS INTERVAL DAY); ); EXECUTE mufford.mReturnOrdersInXDaysExample('1993-09-01', 30); |
Miscellaneous
NULL Handling
SQL Server:
SELECT COALESCE(StandardCost, 0) AS [StandardCost] FROM dbo.DimProduct; -- OR SELECT ISNULL(StandardCost, 0) AS [StandardCost] FROM dbo.DimProduct; |
Teradata:
SELECT COALESCE(P_RetailPrice, 0) AS "RetailPrice" FROM retail.Product; |
Awesome Michelle!!!!!!
Since it looks like you might have already found from personal experience what i am looking for i thought i would ask. I have been asked to migrate tables from sql server to teradata and was just looking for a list of datatypes that would make good matches.
Thanks
Dan
Pingback: SQL Server
It should be noted when populating a table in Teradata using CREATE {VOLATILE} TABLE AS ( SELECT ColA, ColB, ColC FROM TableX) WITH DATA; that Teradata will use the first column in the SELECT statement as the Primary Index. This can lead to poor data distribution (skewing) and poor performance when accessing the table subsequently.
Instead you can explicitly state what the resulting primary index is using CREATE {VOLATILE} TABLE AS (SELECT ColA, ColB, ColC FROM TableX) PRIMARY INDEX (ColA, ColB) WITH DATA;
Pingback: The SQL Server DBA's Guide to Teradata SQL Fool