Wednesday 19 June 2013

46. Subquery




Subquery 

A subquery is a query within a query. It is also called an inner query or nested query. A subquery is usually placed in the WHERE clause of an SQL statement. Typically, a subquery is used when we know how to search for a value using a SELECT statement, but do not know the exact value.


Rules for Subqueries


· Must be enclosed in parentheses
· Can be the object of an IN or NOT IN clause
· Can be the object of an EXISTS or NOT EXISTS clause
· Support the quantifiers ALL, ANY, and SOME
· Support LIKE or NOT LIKE used with a quantifier
· Can specify more than one column to match
· Generate a DISTINCT list of values
· Cannot use ORDER BY (within the subquery statement)
· Keep in mind that a maximum of 64 tables/views can be specified in an SQL statement
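
To illustrate the rules above, here is a minimal sketch. The emp (empid, ename, deptno) and dept (deptno, location) tables and their columns are hypothetical names, not from the original post:

```sql
-- Find employees who work in a department located in 'Chennai'.
-- The inner SELECT is enclosed in parentheses and is the object of IN;
-- note there is no ORDER BY inside the subquery.
SELECT empid, ename
FROM emp
WHERE deptno IN (SELECT deptno
                 FROM dept
                 WHERE location = 'Chennai');
```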

Thursday 14 February 2013

45. INSERT in Teradata

INSERT

We can use INSERT to add a new row to a table.

Syntax :

INSERT INTO table_name [(col_name [, ..., col_name])]  VALUES (expression [, ..., expression]) ;

Example :

INSERT INTO emp (empid, ename) VALUES (101, 'Arun Kumar');

44. DML in Teradata


DML - Data Manipulation Language

The following DML statements are available in Teradata:

INSERT - Inserts a single row into a table.
INSERT SELECT - Inserts one or more rows into a table using values selected from other tables.
UPDATE - Changes column values in existing rows of a table.
DELETE - Removes rows from a table.
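
The four statement types above can be sketched as follows. The emp and emp_backup tables are hypothetical stand-ins:

```sql
INSERT INTO emp (empid, ename) VALUES (101, 'Arun Kumar');  -- single row

INSERT INTO emp_backup SELECT * FROM emp;                   -- INSERT SELECT

UPDATE emp SET ename = 'Arun K' WHERE empid = 101;          -- change column values

DELETE FROM emp WHERE empid = 101;                          -- remove rows
```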

43. CREATE TABLE in Teradata

CREATE TABLE is a DDL command used to create Teradata tables.

When executed, the CREATE TABLE statement creates and stores the table structure definition in the Teradata Data Dictionary.

The CREATE TABLE statement includes:

· Create Table options
· Column definitions
· Table-level constraints
· Index definitions

Syntax :

CREATE <SET/MULTISET> TABLE employee
<Create Table Options>
<Column Definitions>
<Table-level Constraints>
<Index Definitions>;
Where,

SET - No duplicate rows allowed
MULTISET - Duplicate rows allowed

Table options are,
Fallback - A copy of each row is maintained on another AMP
No Fallback - No copy of the data is kept.
Journaling - Keep single/dual copy of row BEFORE/AFTER updates to the row
Freespace - % of freespace (0-75%) that will remain on a cylinder during subsequent loading operations
Datablocksize - Size of data block in which row(s) of the table are stored

Column definitions

Column name - Name the column
Data type – Declare the column to be a character, byte, numeric, or graphic data type.

Data Type Attributes – Specify DEFAULT, FORMAT, TITLE, NULL, CASESPECIFIC, or UPPERCASE.

Column Storage Attributes – Compress NULL values or a specified value.

Column-level Constraint Attributes – Specify the single column as a primary key or foreign key.
– Specify the single column as unique (must be NOT NULL).
– Specify constraint conditions on the column
Example :

CREATE SET TABLE emp ,NO FALLBACK ,
     (
      EmpNo SMALLINT FORMAT '999' NOT NULL,
      Name VARCHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC,
      Salary DECIMAL(8,2),
      City VARCHAR(12) CHARACTER SET LATIN NOT CASESPECIFIC,
      State VARCHAR(2) CHARACTER SET LATIN NOT CASESPECIFIC,
      doj DATE FORMAT 'YYYY-MM-DD')
UNIQUE PRIMARY INDEX ( EmpNo );

42. What is NULL?

What is NULL?

NULL represents the absence of a data value. It is neither a space nor zero.

· NULL represents an unknown or nonexistent value.
· NULL is an SQL keyword.
· NULL is neither a data type nor a characteristic of data.
· NULL columns may be COMPRESSed to occupy zero space in a row.
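
Because NULL is an unknown value rather than an actual value, it must be tested with IS NULL, not with the equality operator. A small sketch, reusing the emp table from the other posts:

```sql
SELECT empid, ename FROM emp WHERE Salary IS NULL;  -- rows with a missing salary

-- Salary = NULL never evaluates to TRUE, so this returns no rows:
SELECT empid, ename FROM emp WHERE Salary = NULL;
```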

41. SHOW VIEW

SHOW VIEW

SHOW VIEW displays the DDL that was used to create a particular view.

Syntax :

SHOW VIEW <view name>;

Example :

SHOW VIEW dept;

Results

CREATE VIEW dept
AS
SELECT department_number
,department_name
,budget_amount
,manager_employee_number
FROM customer_service.department
WHERE department_number > 10;

40. Show Table

Show Table


SHOW TABLE displays the structure of a table and generates the CREATE TABLE script for that particular table.


Example :

SHOW TABLE emp;

The output will be,

CREATE SET TABLE FINANCIAL.emp ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT
     (
      EmpNo SMALLINT FORMAT '999' NOT NULL,
      Name VARCHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC,
      Salary DECIMAL(8,2),
      City VARCHAR(12) CHARACTER SET LATIN NOT CASESPECIFIC,
      State VARCHAR(2) CHARACTER SET LATIN NOT CASESPECIFIC,
      doj DATE FORMAT 'YYYY-MM-DD')
UNIQUE PRIMARY INDEX ( EmpNo );

39. What is BTEQ ?

Basic Teradata Query (BTEQ) is a Teradata SQL tool.

BTEQ is one of several front-end utilities for submitting SQL to the Teradata Database.


BTEQ is Teradata client software that is installed on mainframe hosts or
network-attached clients. It operates under all host systems and local area
networks (LANs).

BTEQ contains additional commands for report formatting.

BTEQ commands:

· Must be preceded by a period mark (.).
· Provide an output listing (an audit trail) of what occurred.
· May use the semicolon (;) following them.


BTEQ can be used as either an interactive or batch utility.
Starting BTEQ

To start BTEQ at a UNIX workstation or node that has the BTEQ utility installed, type bteq at the UNIX command line:

#bteq
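
A minimal session might look like the sketch below; the tdpid, username, and password are placeholders, not real values. BTEQ commands start with a period, while SQL statements end with a semicolon:

```
.LOGON tdpid/username,password

SELECT ename FROM emp;

.LOGOFF
.QUIT
```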

Wednesday 13 February 2013

38. Data Definition Language

Data Definition Language

Data Definition Language (DDL) is the subset of SQL used to create, modify, and remove object definitions:

DDL Statements:

Databases :
    CREATE DATABASE
    MODIFY DATABASE
    DROP DATABASE

Users
    CREATE USER
    MODIFY USER
    DROP USER

Tables
      CREATE TABLE
      ALTER TABLE
      DROP TABLE

Views
    CREATE VIEW
    REPLACE VIEW
    DROP VIEW

Macros
   CREATE MACRO
   REPLACE MACRO
   DROP MACRO

Indexes
    CREATE INDEX
    DROP INDEX

37. Formatting characters

FORMAT Characters

$ Fixed or floating dollar sign

9 Decimal digit (no zero suppress)

Z Zero-suppressed decimal digit

, Comma—inserted where specified

. Decimal point position

- Dash character—inserted where specified

/ Slash character—inserted where specified

% Percent character—inserted where specified

X Character data—each X represents one character

G Graphic data—each G represents one logical (double byte) character

B Blank data—insert a space at this point
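
These characters are combined in a FORMAT phrase on a column or expression. A hedged sketch, reusing the emp table from the other posts (the output formatting is honored by BTEQ):

```sql
-- Floating dollar sign, comma grouping, fixed two decimal places;
-- e.g. 8500.50 would display as $8,500.50.
SELECT Salary (FORMAT '$$$,$$9.99') FROM emp;
```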

36. Arithmetic Functions

Arithmetic Functions

Teradata supports the following arithmetic functions, which provide additional
functionality beyond ANSI SQL:

1) ABS—absolute value

Example :

Abs(5) = 5
Abs(-3)  = 3

2) EXP—raises e to the power of <arg>

EXP(5) = 148.41


3) LOG—base 10 logarithm

     LOG(10) = 1

4) LN —natural logarithm

    LN(10) = 2.30

5) SQRT—square root

 SQRT(25) = 5
 SQRT(100) = 10


35. Time datatype

under construction

34. Graphic Data Types

Graphic Data Types
 
Teradata provides three data types for graphics data:

· GRAPHIC
· VARGRAPHIC
· LONG VARGRAPHIC

Graphic data types support double byte graphic data for KanjiEBCDIC,
KanjiShift-JIS, and KanjiEUC character sets. Only multibyte characters
(MBCs) are stored in graphic data types, unlike character data types, which
can store a mix of MBCs and single byte characters (SBCs) when the
Teradata RDBMS is configured to support MBCs.

33. Interval Data Types

Interval Data Types

Teradata supports the following interval data types:

· INTERVAL YEAR
· INTERVAL MONTH
· INTERVAL DAY
· INTERVAL YEAR TO MONTH
· INTERVAL HOUR
· INTERVAL MINUTE
· INTERVAL SECOND
· INTERVAL DAY TO HOUR
· INTERVAL DAY TO MINUTE
· INTERVAL DAY TO SECOND
· INTERVAL HOUR TO MINUTE
· INTERVAL HOUR TO SECOND
· INTERVAL MINUTE TO SECOND

32. Date Time Data Types

Date/Time Data Types

DATE represents a calendar date and simplifies handling and formatting
date variables, common to business applications. DATE is stored as an
integer, using this formula:

(year - 1900) * 10000 + (month * 100) + day
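
For example, under this formula the date 14 February 2013 is stored as the integer 1130214:

```sql
/* (2013 - 1900) * 10000 + (2 * 100) + 14
   = 1130000 + 200 + 14
   = 1130214 */
SELECT (2013 - 1900) * 10000 + (2 * 100) + 14;   -- 1130214
```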

Warning: The Teradata DATE is not compatible with the DATE defined in
Intermediate level ANSI SQL.

31. Data Types - Byte Data

Data Types—Byte Data
 
Teradata supports two data types for binary data:

· BYTE is for fixed-length binary strings.
· VARBYTE is for variable-length binary strings.

30. Data Types - Character

Data Types—Character Data

ANSI Compliant

Two ANSI data types are available for character data:

· CHAR is for fixed-length character strings.

· VARCHAR is for variable-length strings. (Synonyms include CHAR VARYING and CHARACTER VARYING.) It is intermediate-level ANSI.
Teradata Extensions

In addition to the CHAR and VARCHAR (CHAR VARYING) data types,
Teradata supports LONG VARCHAR, equivalent to VARCHAR (64000), the
largest variable-length string.
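
A hedged sketch of how these types might be declared in a table definition (names_demo and its columns are hypothetical names):

```sql
CREATE TABLE names_demo (
    state_code CHAR(2),        -- fixed length: always stores 2 characters
    full_name  VARCHAR(40),    -- variable length: up to 40 characters
    bio        LONG VARCHAR    -- Teradata extension: same as VARCHAR(64000)
);
```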

29. Arithmetic Operators

Arithmetic Operators

Teradata provides the following ANSI compliant arithmetic operators:

* multiply
/ divide
+ add
- subtract
+ positive value
- negative value

Teradata Extensions

Teradata supports two additional arithmetic operators that provide additional
functionality beyond ANSI SQL:

· ** (exponentiation) is used in the form <n>**<arg>, where <n> is raised to the power of <arg>.

Example: 4**3 = 4 * 4 * 4 = 64.

· MOD is the modulo operator, which calculates the remainder in a division operation.

Example: 60 MOD 7 = 4
(Since 60 divided by 7 equals 8, with a remainder of 4.)


Tuesday 12 February 2013

28. Fallback Cluster

Fallback Cluster

A cluster is a group of AMPs that act as a single Fallback unit. Clustering has no effect on primary row distribution of the table, but the Fallback row will always go to another AMP in the same cluster.

Cluster sizes are set through a Teradata console utility and may range
from 2 to 16 AMPs per cluster (not all clusters in a system have to be the
same size). The example shows an 8-AMP system set up in two clusters
of 4-AMPs each.

Should an AMP fail, the primary and Fallback row copies stored on that
AMP cannot be accessed. However, their alternate copies are available
through the other AMPs in the same cluster.

The loss of an AMP in one cluster has no effect upon other clusters. It is
possible to lose one AMP in each cluster and still have full access to all
Fallback-protected table data. If there are two AMP failures in the same
cluster, the entire Teradata system halts.

While an AMP is down, the remaining AMPs in the cluster must do their
own work plus the work of the down AMP. The larger the size of the
cluster, the less noticeable the workload increase is within that cluster
when one AMP fails. Large cluster sizes are more vulnerable to a second
failure before recovery from the first failure is complete. Remember that a
second failure halts the entire Teradata system.

If you had an 8-AMP system with 4 clusters of 2 AMPs each, the system
could lose four AMPs (one per cluster) and continue operations. If the
system workload is near capacity, there will be some loss of performance

27. Fallback

Fallback

Fallback protects your data by storing a second copy of each row of a table on an alternate, Fallback AMP in the same cluster.

If an AMP fails, the system accesses the Fallback rows to meet requests.

Fallback provides AMP fault tolerance at the table level. With Fallback tables, if one
AMP fails, all table data is still available. Users may continue to use
Fallback tables without any loss of available data.

During table creation or after a table is created, you may specify whether
or not the system should keep a Fallback copy of the table. If Fallback is
specified, it is automatic and transparent.

Fallback guarantees that the two copies of a row will always be on
different AMPs. If either AMP fails, the alternate row copy is still available
on the other AMP.

There is a benefit to protecting your data, but there are costs associated
with that benefit. With Fallback use, you need twice the disk space for
storage and twice the I/O for INSERTs, UPDATEs, and DELETEs. (The
Fallback option does not require any extra I/O for SELECT operations and
the Fallback I/O will be performed in parallel with the primary I/O.)

The benefits of Fallback include:

· Protects your data from hardware (disk) failure.
· Protects your data from software (node) failure.
· Automatically recovers with minimum recovery time after repairs or fixes are complete.

26. System components

Teradata System

The initial Teradata system comes with several users and databases:

- User DBC
- User Sysadmin
- User SystemFE
- Crashdumps database
- Default, All, and Public Databases, and TDPuser

Initially, most of the space in the system belongs to the user DBC, with
some allocated to each of the users or databases listed above.

Database DBC contains all of the RDBMS software components and system tables.

25. Teradata User

A Teradata User

A user can be thought of as a collection of tables, views, macros, triggers, and stored procedures.

A user is the same as a database except that a user can actually log on to
the RDBMS.

To log on, a user must have a password. A user may or may not have Perm Space.

A user could be considered an active repository, because it is used to log on to the system as well as to store other database objects.

Users can access other databases depending on the privileges they have
been granted.
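
Since a user is a database with a password, creating one looks much like creating a database. A hedged sketch with placeholder names and sizes:

```sql
CREATE USER arun FROM DBC AS
    PASSWORD = secret123,
    PERM = 0,               -- a user may have no Perm Space at all
    SPOOL = 1000000000;     -- but still needs Spool Space to run queries
```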

24. Teradata Database

A Teradata Database

A database provides a logical grouping of information (tables, views, and
macros). A database could be considered a passive repository, because it
is used solely to store other database objects. This is different from a user,
which we will learn about on the next page.

Perm Space
All databases have a defined upper limit of Permanent Space. Permanent Space is used for storing the data rows of tables. Perm Space is not pre-allocated. It represents a maximum limit.

Spool Space
All databases also have an upper limit of Spool Space. If there is no limit defined for a particular database or user, limits are inherited from parents. Theoretically, a user could use all unallocated space in the system for their query. Spool Space is Temporary Space used to hold intermediate query results or formatted answer sets to queries. Once the query is complete, the Spool Space is released.

Example: You have a database with total disk space of 100GB. You have 10GB of user data and an additional 10GB of overhead. What is the maximum amount of Spool Space available for queries?

Answer: 80GB. All of the remaining space in the system is available for spool.

Temp Space
The third type of space is Temporary Space. Temp Space is used for global temporary tables, and these results remain available to the user until the session is terminated. Tables created in Temp Space will survive a restart. Temp Space is permanent space currently not used.
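
The Perm and Spool limits described above are declared when the database is created. A hedged sketch with placeholder names and sizes:

```sql
CREATE DATABASE finance FROM DBC AS
    PERM  = 10000000000,    -- upper limit for permanent table storage (10 GB)
    SPOOL = 20000000000;    -- upper limit for intermediate query results (20 GB)
```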

23. Teradata Admin tools

Administrative Tools

Teradata Manager
Teradata Manager is a production and performance monitoring system
that simplifies the tasks of monitoring, controlling, and administering one
or more Teradata Databases.
With Teradata Manager you can use a variety of specially designed tools
and applications to gather, manipulate, and analyze information about
each Teradata system you need to administer.
It provides performance monitoring and database utilization statistics that
are collected, analyzed, and displayed. Teradata Manager can produce
its reports either in standard report format or in graphical format. Its
structure can be customized to configure it to address a wide variety of
issues that suit the needs of individuals or groups within your organization.
Teradata Manager runs on a PC that is network-connected to the
Teradata Database platform(s). It supports Windows 98, NT, 2000, and
XP.

Teradata Dynamic Query Manager (TDQM)
TDQM allows the Database Administrator to provide operational control of
and to effectively manage and regulate access to the Teradata Database.
The database administrator can use the following capabilities of TDQM to
manage work submitted to the database in order to maximize system
resource utilization:

· Query Management
· Scheduled Requests

With Query Management, database query requests are intercepted within
the Teradata database, their components are compared against criteria
that are defined by the administrator, and those requests that fail to meet
the criteria are rejected.

With Scheduled Requests, MS Windows clients can submit SQL requests
to be executed off-line at scheduled off-peak times.

22.Teradata Warehouse Builder

Teradata Warehouse Builder

Teradata Warehouse Builder (TWB) is a flexible, high performance data
warehouse loading tool that enables parallel data extraction,
transformation and loading processes common to all data warehouses.

TWB is scalable and enables end-to-end parallelism. The previous
versions of utilities (like FastLoad) allow you to load data into Teradata in
parallel using a single input stream. Teradata Warehouse Builder allows
you to run multiple instances of the extract, optional transformation, and
load. You can have as many loads as you have sources in the same job.
With multiple sources of data coming from multiple platforms, integration is
important in a parallel environment.

Teradata Warehouse Builder eliminates the need for persistent storage. It
stores data into data buffers so you no longer need to write data into a flat
file. Since you don’t need flat files, you no longer need to worry about a
2GB file limit.

Teradata Warehouse Builder provides a single scripting language. You can do the extract, some transformation, and loads all in one SQL-like scripting language. Once the dynamics of the language are learned, you can perform multiple tasks with a single script.

You can use a script converter to convert existing scripts for utilities
(FastLoad, MultiLoad, FastExport and TPump) into Teradata Warehouse
Builder scripts.

Teradata Warehouse Builder supports FastLoad INMODs, FastExport
OUTMODs, and Access Modules to provide access to all the data sources
you use today.

21. TPump Utility

TPump Utility

TPump (Teradata Parallel Data Pump) is a utility that provides high-volume batch maintenance of large Teradata Databases. It enables acquisition of data from the client with low processor utilization.
The Support Environment enhances TPump's functionality. In addition to
coordinating activities involved in TPump tasks, it provides facilities for
managing file acquisition, conditional processing, and certain DML (Data
Manipulation Language) and DDL (Data Definition Language) activities on
the Teradata Database.

The Support Environment enables an additional level of user control over TPump.

TPump uses row-hash locks, making concurrent updates on the same
table a possibility.

TPump has a built-in resource governing facility that allows the operator to
specify how many updates occur (the statement rate) minute by minute,
then change the statement rate while the job continues running. This
utility can be used to increase the statement rate during windows when
TPump is running by itself, then decrease the statement rate later on if
users log on for ad-hoc query access.

TPump can always be stopped and all work will be committed as run.
TPump does not utilize one of the loader slots used by FastLoad,
MultiLoad, and FastExport, so many TPump jobs can run concurrently.

20.FastExport Utility

FastExport Utility
 
FastExport is a command-driven utility that uses multiple sessions to
quickly transfer large amounts of data from tables and views on the
Teradata Database to a client-based application.

You can export data from any table or view to which you have the
SELECT access privilege. The destination for the exported data can be:

· A file on your channel-attached or network-attached client system.

· An Output Modification (OUTMOD) routine you write to select, validate, and preprocess the exported data.

How it Works?

FastExport processes a series of FastExport commands and Teradata
SQL statements you enter, usually as a batch mode job script. The
FastExport commands provide the session control and data handling
specifications for the data transfer operations. The Teradata SQL
statements perform the actual data export functions on the Teradata
tables and views.

What it Does?

When you invoke FastExport, the utility executes the FastExport
commands and Teradata SQL statements in your FastExport job script.

These direct the FastExport utility to:

1. Log you on to Teradata for a specified number of sessions, using your
username, password and tdpid/acctid information.

2. Retrieve the specified data from Teradata, in accordance with your
format and selection specifications.

3. Export the data to the specified file or OUTMOD routine on your client
system.

4. Log you off of Teradata.

19. MultiLoad Utility

MultiLoad Utility

The MultiLoad utility gives you an efficient way to deal with batch maintenance of large databases. MultiLoad is a command-driven utility you can use to do fast, high-volume maintenance on multiple tables and views of a Teradata Database. Using a single MultiLoad job, you can do a number of different import and delete tasks on database tables and views:

· Each MultiLoad import task can do multiple data insert, update, and delete functions on up to five different tables or views.

· Each MultiLoad delete task can remove large numbers of rows from a single table.

You can use MultiLoad to import data from:

· Disk or tape files on a channel-attached client system
· Input files on a network-attached workstation
· Special input module (INMOD) programs you write to select, validate, and preprocess input data
· Access modules
· Any device providing properly formatted source data

The table or view in the database receiving the data can be any existing table or view for which you have access privileges for the maintenance tasks you want to do.

How it Works?

MultiLoad processes a series of MultiLoad commands and Teradata SQL
statements you enter usually as a batch mode job script. You use the
MultiLoad commands for session control and data handling of the data
transfers. The Teradata SQL statements do the actual maintenance
functions on the database tables and views.

17. Teradata FastLoad Utility

FastLoad Utility

FastLoad is a command-driven utility you can use to quickly load large
amounts of data in an empty table on a Teradata Database.

You can load data from:

· Disk or tape files on a channel-attached client system
· Input files on a network-attached workstation
· Special Access module (AXSMOD) or input module (INMOD) routines you write to select, validate, and preprocess input data
· Any other device providing properly formatted source data

FastLoad uses multiple sessions to load data. It loads data into a single table on a Teradata system per job. If you want to load data into more than one table in the system, you can submit multiple FastLoad jobs, one for each table.

How It Works
FastLoad processes a series of FastLoad commands and Teradata SQL
statements you enter either interactively or in batch mode. You use the
FastLoad commands for session control and data handling of the data
transfers. The Teradata SQL statements create, maintain and drop tables
on the Teradata Database.
During a load operation, FastLoad inserts the data from each record of
your data source into one row of the table on a Teradata Database. The
table on the Teradata system receiving the data must be empty and have
no defined secondary indexes.

What It Does
When you invoke FastLoad, the utility executes the FastLoad commands
and Teradata SQL statements in your FastLoad job script. These direct
the FastLoad utility to:

1. Log you on to Teradata for a specified number of sessions, using your
username, password and tdpid/acctid information.

2. Load the input data into the FastLoad table on the Teradata system.

3. Log you off from Teradata.

16. Teradata SQL Tools


SQL Tools
Teradata provides two front-end SQL query products.

BTEQ
BTEQ stands for Basic Teradata Query utility. It is an SQL front-end that
runs on all client platforms. It supports both interactive ad hoc and
scripted batch queries and provides standard report writing and formatting
functionality. It also provides a basic import/export capability. Other
utilities are available for these functions, particularly when larger tables
are involved. When a user logs on with BTEQ, they can set the session
number they need before logging on. Using multiple sessions may
improve performance for certain operations.

Teradata SQL Assistant
Teradata SQL Assistant (formerly called Queryman) is an SQL front-end for Teradata, as well as any other ODBC compliant database. It offers a
full menu of services including query history, timings, status, row counts,
random sampling, and limited answer sets. It provides an import/export
feature between database and PC and also allows export to Excel and
Access.

14. Teradata Data Dictionary

The Data Dictionary (DD)


The Data Dictionary (DD) is an integrated set of system tables that:

- Stores database object definitions and accumulates information about users, databases, resource usage, data demographics, and security rules.

- Records specifications about tables, views, and macros.

- Contains information about ownership, space allocation, accounting, and access rights (privileges) for these objects.

Data Dictionary information is updated automatically during the processing of Teradata SQL data definition language (DDL) statements. It is used by the Parser to obtain information needed to process all Teradata SQL statements.

Users may access the Data Dictionary through Teradata-supplied views, if
permitted by the System Administrator.

15. Teradata SQL

Structured Query Language (SQL)

Structured Query Language (SQL) is the language used to access the
database. It is sometimes referred to as a Fourth Generation Language
(4GL) to differentiate it from Third Generation Languages such as
FORTRAN and COBOL, though it is quite different from other 4GLs. It
acts as an intermediary between the user and the database. SQL defines
the answer set that is returned from the database.
SQL is different from other computer languages. Its statements resemble English-like structures. It provides powerful, set-oriented database manipulation including structural modification, data retrieval, modification, and security functions.

There are four important subsets of SQL: the Data Definition Language,
the Data Manipulation Language, Data Control Language and Stored
Procedure Language.

Data Definition Language (DDL)
DDL allows a user to define the database objects and the relationships that exist among them. Examples of DDL uses are creating or modifying tables and views.

Data Manipulation Language (DML)
DML consists of the statements that manipulate, change, or retrieve the data rows of the database. If the DDL defines the database, the DML lets the user change the information contained in the database. The DML is the most commonly used subset of SQL. It is used to select, update, delete, and insert rows.

Data Control Language (DCL)
DCL is used to restrict or permit a user's access. It can selectively limit a user's ability to retrieve, add, or modify data. It is used to grant and revoke access privileges on tables and views.
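
A hedged DCL sketch, reusing the emp table and assuming a hypothetical user named arun:

```sql
GRANT SELECT, INSERT ON emp TO arun;   -- permit retrieval and inserts
REVOKE INSERT ON emp FROM arun;        -- withdraw the insert privilege again
```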

Stored Procedure Language (SPL)
SPL is used to create and call stored procedures in Teradata.

Teradata Extensions to SQL
User Assistance commands allow you to list the objects in a database or
the characteristics of a table, see how a query will execute, or show the
details of your system. They vary widely from vendor to vendor.

13. Teradata objects

Teradata Objects

A database in the Teradata system is a collection of objects known as tables, views, macros, triggers and stored procedures. Databases provide a logical grouping for information. They are also the foundation for space allocation and access control.

1) Tables
A table is the logical structure of data in an RDBMS. It is a two-dimensional structure made up of columns and rows. A user defines
. A user defines
a table by giving it a table name that refers to the type of data that will be
stored in the table (e.g., an Employee table stores data about employees.)

A column represents attributes of the table. Column names are given to each column of the table. All information in a column is of the same type. For example, a column named Date of Birth would only hold date of birth information. Each occurrence of an entity is stored in the table as a row.
Entities are the people, places, things, or events that the table is
describing. Tables require Permanent Space to store rows.

2) Views
A view is a pre-defined subset of one or more tables or other views. It
does not exist as a real table, but serves as a reference to existing tables
or views. One way to think of a view is as a virtual table. Views have
definitions in the Data Dictionary (DD) but do not contain any physical
rows. Views can be used to control access to the underlying tables.
Views can be used to hide columns from users, to insulate applications
from database changes, and to standardize or simplify access techniques.
Views do not require Permanent Space.

3) Macros
A macro is a predefined, stored set of one or more SQL commands and
report-formatting commands. Macros are used to simplify the execution of
frequently used SQL commands. Macros do not require Permanent
Space.
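
A hedged sketch of a macro, reusing the emp table (get_dept_emps and its parameter are hypothetical names). Parameters are referenced inside the macro body with a leading colon, and the macro is run with EXEC:

```sql
CREATE MACRO get_dept_emps (dept_no INTEGER) AS (
    SELECT empid, ename
    FROM emp
    WHERE deptno = :dept_no;   -- :dept_no refers to the macro parameter
);

EXEC get_dept_emps (10);
```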

4) Triggers
A trigger is an event-driven procedure attached to a table. A trigger defines events that happen when some other event, called a triggering event, occurs. A trigger consists of one or more SQL statements that are
associated with a table and are executed when the trigger is fired. A
trigger is created with the CREATE TRIGGER statement.
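
A hedged sketch of a trigger that logs salary changes; the salary_log table and all names are hypothetical, and exact clauses may vary by Teradata release:

```sql
CREATE TRIGGER salary_audit
AFTER UPDATE OF (Salary) ON emp          -- the triggering event
REFERENCING OLD AS OldRow NEW AS NewRow
FOR EACH ROW
(
    -- the triggered action: record the old and new values
    INSERT INTO salary_log (EmpNo, old_salary, new_salary)
    VALUES (NewRow.EmpNo, OldRow.Salary, NewRow.Salary);
);
```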

5) Stored Procedures
A stored procedure is a pre-defined set of statements invoked through a
single SQL CALL statement. Stored procedures may contain both
Teradata SQL statements and procedural statements (in Teradata,
referred to as Stored Procedure Language, or SPL). Macros are similar to
stored procedures but do not contain SPL.
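
A hedged sketch of a stored procedure and its CALL, reusing the emp table (new_emp and its parameters are hypothetical names). The BEGIN...END block is SPL; parameters are referenced with a leading colon inside SQL statements:

```sql
CREATE PROCEDURE new_emp (IN p_id INTEGER, IN p_name VARCHAR(20))
BEGIN
    INSERT INTO emp (empid, ename) VALUES (:p_id, :p_name);
END;

CALL new_emp (102, 'Priya');
```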

12. Teradata Client tools

Teradata Client Software

The Teradata Database requires three distinct pieces of software: TPA,
PDE, and OS.

An operating system (OS), UNIX or Windows 2000, and a Teradata
software license are necessary for each node.

A Trusted Parallel Application (TPA) implements virtual processors and runs on the OS with PDE. The Teradata Database is classified as a TPA.

The components of the Teradata Database software include:

· Channel Driver
· Teradata Gateway
· AMP
· PE

Parallel Database Extensions (PDE) were added to the operating system
to support the parallel software environment.

On UNIX, a virtual processor (vproc) is a collection of software processes
running under the multi-tasking environment of the UNIX operating
system.

On Windows 2000, a vproc is a software process that runs under the
multi-tasking environment of the Windows 2000 operating system.
The client may be a mainframe system, such as IBM or Amdahl, which is channel-attached to the Teradata Database, or it may be a PC or UNIX-based system that is LAN-attached.
The client application submits an SQL request to the RDBMS, receives the response, and submits the response to the user.

11. What is AMP ?

The Access Module Processor (AMP)

The Access Module Processor (AMP) is the virtual processor (vproc) in
Teradata's shared-nothing architecture that is responsible for managing a
portion of the database. An AMP will control some portion of each table
on the system. AMPs do the physical work associated with generating an
answer set, including sorting, aggregating, formatting and converting. The
AMPs perform all database management functions in the system.
An AMP responds to Parser/Optimizer steps transmitted across the
BYNET by selecting data from or storing data to its disks. For some
requests, the AMPs may redistribute a copy of the data to other AMPs.

The Database Manager subsystem resides on each AMP. This
subsystem will:

· Lock databases and tables.
· Create, modify, or delete definitions of tables.
· Insert, delete, or modify rows within the tables.
· Retrieve information from definitions and tables.
· Return responses to the Dispatcher.

The Database Manager subsystem provides a bridge between the logical
organization of the data (as tables, rows, and columns) and the physical
organization of the data on disks. It also performs a space-management
function that controls the use and allocation of space.
Teradata performs all tasks in parallel, providing exceptional performance.
The greater the number of tasks processed in parallel, the better the
system performance. Many databases call themselves "parallel", but they
can only perform some tasks in parallel.
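
The idea that every AMP owns a portion of each table can be sketched as a simple hash-distribution model. This is a conceptual illustration only: Teradata uses its own proprietary hash function and hash-bucket map, and the AMP count and row data below are assumptions.

```python
# Conceptual sketch: rows are assigned to AMPs by hashing the
# primary-index value. Not Teradata's real hashing algorithm.

NUM_AMPS = 4  # assumed system configuration

def amp_for_row(primary_index_value):
    """Map a primary-index value to an AMP number via hashing."""
    return hash(str(primary_index_value)) % NUM_AMPS

# Distribute some sample employee rows by their primary index (empid).
rows = [(101, 'Arun'), (102, 'Kumar'), (103, 'Ravi'), (104, 'Siva')]
amps = {n: [] for n in range(NUM_AMPS)}
for empid, name in rows:
    amps[amp_for_row(empid)].append((empid, name))

# Each AMP ends up owning some portion of the table.
for n, owned in amps.items():
    print(f"AMP {n}: {owned}")
```

Because the hash of the primary index decides row placement, no single AMP holds the whole table, which is what lets the AMPs work on a query in parallel.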

10. What is BYNET?

BYNET


The BYNET handles the internal communication of the Teradata
Database. All communication between PEs and AMPs is done via the
BYNET.

When the PE dispatches the steps for the AMPs to perform, they are
dispatched onto the BYNET. The messages are routed to the appropriate
AMP(s) where results sets and status information are generated. This
response information is also routed back to the requesting PE via the
BYNET.
Depending on the nature of the dispatch request, the communication
between nodes may be a:

· Broadcast - message is routed to all nodes in the system.
· Point-to-point - message is routed to one specific node in the system.

Once the message is on a participating node, PDE directs the message to
the appropriate AMPs on that node. All AMPs receive a broadcast
message. With a point-to-point or multicast (multiple AMPs) message, the
message is directed only to the appropriate AMP(s) on that node.
So, while a Teradata system does do multicast messaging, the BYNET
hardware alone cannot do it - the BYNET can only do point-to-point and
broadcast between nodes.
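
The broadcast and point-to-point routing described above can be sketched as follows. The node names and message format are invented for illustration; real BYNET routing is done in hardware and PDE software.

```python
# Toy model of BYNET-style message routing between nodes.
# Node names and the message structure are illustrative only.

nodes = ['node1', 'node2', 'node3', 'node4']

def route(message, destination=None):
    """Broadcast to all nodes, or point-to-point to one node."""
    if destination is None:
        return {node: message for node in nodes}   # broadcast
    return {destination: message}                  # point-to-point

# A broadcast step reaches every node; PDE on each node then decides
# which local AMPs actually receive the message.
delivered = route('retrieve step')
assert set(delivered) == set(nodes)

# A point-to-point step is routed to exactly one node.
delivered = route('single-AMP step', destination='node3')
assert delivered == {'node3': 'single-AMP step'}
```

The per-node fan-out to individual AMPs (including multicast to a subset of AMPs) would sit on top of this, which is why multicast is a software function rather than a BYNET hardware one.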
The BYNET has several unique features:

Fault tolerant: each network has multiple connection paths. If the BYNET
detects an unusable path in either network, it will automatically reconfigure
that network so all messages avoid the unusable path. Additionally, in the
rare case that BYNET 0 cannot be reconfigured, hardware on BYNET 0 is
disabled and messages are re-routed to BYNET 1, and vice versa.

Load balanced: traffic is automatically and dynamically distributed
between both BYNETs.

Scalable: as you add nodes to the system, overall network bandwidth
scales linearly - meaning an increase in system size without loss of
performance.

High performance: an MPP system has two BYNET networks. Because
both networks are active, the system benefits from the full aggregate
bandwidth.

The technology of the BYNET is what makes the Teradata parallelism
possible.

9. Parsing Engine

The Parsing Engine

A Parsing Engine (PE) is a virtual processor (vproc). It is made up of the
following software components: Session Control, the Parser, the
Optimizer, and the Dispatcher.

Once a valid session has been established, the PE is the component that
manages the dialogue between the client application and the RDBMS.
Each PE can handle up to 120 sessions, and each session can handle
multiple requests.

Session Control
The major functions performed by Session Control are logon and logoff.
Logon takes a textual request for session authorization, verifies it, and
returns a yes or no answer. Logoff terminates any ongoing activity and
deletes the session’s context.

Parser
The Parser interprets SQL statements, checks them for proper SQL
syntax and evaluates them semantically. The PE also consults the Data
Dictionary to ensure that all objects and columns exist and that the user
has authority to access these objects.

Optimizer
The Optimizer is responsible for developing the least expensive plan to
return the requested response set. Processing alternatives are evaluated
and the fastest plan is chosen. This plan is converted to executable steps,
to be performed by the AMPs, which are then passed to the Dispatcher. In
order to maximize throughput and minimize resource contention, the
Optimizer must know about the system configuration, the available units
of parallelism (AMPs and PEs), and data demographics. The Teradata
Optimizer is robust and intelligent; it enables Teradata to handle multiple
complex, ad hoc queries efficiently. It is parallel-aware and cost-based,
and uses full look-ahead capability.
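
Cost-based plan selection, at its simplest, means estimating a cost for each candidate access path and choosing the cheapest. The plan names and cost figures below are invented for illustration; the real Optimizer's cost model is far richer.

```python
# Toy cost-based optimizer: evaluate candidate plans and pick the one
# with the lowest estimated cost. Plans and costs are illustrative only.

candidate_plans = {
    'full-table scan':        120.0,
    'primary-index access':     1.5,
    'secondary-index access':  12.0,
}

def choose_plan(plans):
    """Return the candidate plan with the lowest estimated cost."""
    return min(plans, key=plans.get)

print('chosen plan:', choose_plan(candidate_plans))
```

Data demographics (row counts, value distributions) feed into those cost estimates, which is why up-to-date statistics matter so much to plan quality.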

Dispatcher
The Dispatcher controls the sequence in which the steps are executed
and passes the steps on to the BYNET. It is composed of execution-control
and response-control tasks. Execution control receives the step
definitions from the Parser, transmits them to the appropriate AMP(s)
for processing, receives status reports from the AMPs as they process the
steps, and passes the results on to response control once the AMPs have
completed processing. Response control returns the results to the user.
The Dispatcher ensures that all AMPs have finished a step before the next
step is dispatched. Depending on the nature of the SQL request, a step
may be sent to one AMP, or broadcast to all AMPs.
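
The Dispatcher's execution-control loop (send a step, wait for every participating AMP, then send the next step) can be sketched like this. The step list and AMP behaviour are invented for illustration.

```python
# Toy model of Dispatcher execution control: each plan step is sent to
# its participating AMPs, and the next step is not dispatched until
# every AMP reports completion. Steps and AMPs are illustrative only.

def run_plan(steps, num_amps):
    """steps: list of (step_name, broadcast) tuples."""
    results = []
    for name, broadcast in steps:
        # A step goes either to all AMPs or to a single AMP.
        targets = list(range(num_amps)) if broadcast else [0]
        statuses = [f"AMP {a} done: {name}" for a in targets]
        # Execution control waits for every participating AMP
        # before dispatching the next step.
        assert len(statuses) == len(targets)
        results.append((name, statuses))
    return results

plan = [('lock table', True), ('retrieve rows', True), ('return rows', False)]
for name, statuses in run_plan(plan, num_amps=4):
    print(name, '->', len(statuses), 'AMP responses')
```

The strict step-at-a-time barrier is what lets response control assemble a complete, correct answer set before anything is returned to the user.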