Wednesday, 19 June 2013

Leaaion 46 : Subquery




Subquery 

Subquery  is a query whithin  in a query. It is also called as Inner query or Nested query. A subquery is usually added in the WHERE clause of the sql statement.  Usually, a subquery is used when we know how to search for a value using a SELECT statement, but do not know the exact value.


Rules for Subqueries


· Must be enclosed in parentheses


· Can be the object of an IN or NOT IN clause

· Can be the object of EXISTS or NOT EXISTS clause

· Support quantifiers ALL, ANY, SOME

· Support LIKE or NOT LIKE used with a quantifier

· Can specify more than one column to match

· Generate a DISTINCT list of values


· Cannot use ORDER BY (within the subquery
statement).

·
Keep in mind that a maximum of 64
tables/views can be specified in an SQL
statement.

Thursday, 14 February 2013

45. INSERT in Teradata

INSERT

We  can use INSERT to add a new row to a table.

Syntax :

INSERT INTO table_name [(col_name [, ..., col_name])]  VALUES (expression [, ..., expression]) ;

Example :

INSERT INTO emp(empid, ename) VALIES(101. 'Arun Kumar');

44. DML in Teradata


DML - Data Manipulation Language,

Following are DML statements are available in Teradata,

INSERT - Inserts a single row into a table.
INSERT SELECT - Inserts one or more rows into a table using values selected from other tables.
UPDATE - Changes column values in existing rows of a table.
DELETE -Removes rows from a table.

43. CREATE TABLE in Teradata

CREATE TABLE is DDL command which is used to create teradata tables.

When executed, the CREATE TABLE statement creates and stores the table structure definition in the Teradata Data Dictionary.

The CREATE TABLE statement includes:

·
Create Table options

·
Column definitions

·
Table-level constraints

·
Index definitions

Syntax :

CREATE <SET/MULTISET> TABLE employee
<Create Table Options>
<Column Definitions>
<Table-level Constraints>
<Index Definitions>;
Where,

SET  - No duplicate rows allowed

MULTISET -duplicate rows allowed

Table options are,
Fallback  - copy of data is maintained by another AMP
No Fallback  - No copy of data.
Journaling - Keep single/dual copy of row BEFORE/AFTER updates to the row
Freespace - % of freespace (0-75%) that will remain on a cylinder during subsequent loading operations
Datablocksize - Size of data block in which row(s) of the table are stored

Column definitions

Column name - Name the column
Data type – Declare the column to be a character, byte,numeric, or graphic data type.

Data type Attributes are  Specify DEFAULT, FORMAT, TITLE, NULL,CASESPECIFIC, UPPERCASE.

Column Storage Attributes – Compress NULL values or a specified value.

Column-level Constraint Attributes – Specify the single column as a primary key or foreign key.
– Specify the single column as unique (must be NOT NULL).
– Specify constraint conditions on the column
Example :

CREATE SET TABLE emp ,NO FALLBACK ,
     (
      EmpNo SMALLINT FORMAT '999' NOT NULL,
      Name VARCHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC,
      Salary DECIMAL(8,2),
      City VARCHAR(12) CHARACTER SET LATIN NOT CASESPECIFIC,
      State VARCHAR(2) CHARACTER SET LATIN NOT CASESPECIFIC,
      doj DATE FORMAT 'YYYY-MM-DD')
UNIQUE PRIMARY INDEX ( EmpNo );

42. What is NULL?

What is NULL?

NULL represents the absence of a data value. This is neither space or zero.

·
NULL represents an unknown or nonexistent value.

·
NULL is an SQL keyword.

·
NULL is neither a data type nor a characteristic of data.

·
NULL columns may be COMPRESSed to occupy zero space in a row

41. SHOW VIEW

SHOW VIEW

Can be used to display the DDL for VIEW which was used to create that particular view

Syntax :

SHOW VIEW <view name>;

Exmaple :

Show view dept;

Results

CREATE VIEW dept
AS
SELECT department_number
,department_name
,budget_amount
,manager_employee_number
FROM customer_service.department
WHERE department_number > 10;

40. Show Table

Show Table


Show Table is used to display the structure of any table  and to generate create table script for that particular table


Example :

show table emp

The output will be,

CREATE
SET TABLE FINANCIAL.emp ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT(
EmpNo SMALLINT FORMAT '999' NOT NULL,
Name VARCHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC,
Salary DECIMAL(8,2),
City VARCHAR(12) CHARACTER SET LATIN NOT CASESPECIFIC,
State VARCHAR(2) CHARACTER SET LATIN NOT CASESPECIFIC,
doj DATE FORMAT 'YYYY-MM-DD')UNIQUE PRIMARY INDEX ( EmpNo );

39. What is BTEQ ?

Basic Teradata Query (BTEQ) is a Teradata SQL tool.

BTEQ is one of several front-end utilities for submitting SQL to the Teradata


BTEQ is Teradata client software that is installed on mainframe hosts or
network-attached clients. It operates under all host systems and local area
networks (LANs).

BTEQ contains additional commands for report formatting.

BTEQ commands:

· 
Must be preceded by a period mark (.).

·
Provide an output listing (an audit trail) of what occurred.

·
May use the semicolon (;) following them.


BTEQ can be used as either an interactive or batch utility.
Starting BTEQ

To start BTEQ at a UNIX workstation or node that has the BTEQ utility
installed, type
bteq at the UNIX command line:
#bteq

Wednesday, 13 February 2013

38. Data Definition Language

Data Definition Language

Data Definition Language (DDL) is used by SQL
to create, modify, and remove object definitions:

DDL Statements:

Databases :
    CREATE DATABASE
    MODIFY DATABASE
    DROP DATABASE

Users CREATE USER
    MODIFY USER
    DROP USER

Tables
      CREATE TABLE
      ALTER TABLE
      DROP TABLE

Views
    CREATE VIEW
    REPLACE VIEW
    DROP VIEW

Macros
   CREATE MACRO
   REPLACE MACRO
   DROP MACRO

Indexes
    CREATE INDEX
    DROP INDEX

37. Formatting characters

FORMAT Characters

$ Fixed or floating dollar sign

9 Decimal digit (no zero suppress)

Z Zero-suppressed decimal digit

, Comma—inserted where specified

. Decimal point position

- Dash character—inserted where specified

/ Slash character—inserted where specified

% Percent character—inserted where specified

X Character data—each X represents one
character

G Graphic data—each G represents one logical
(double byte) character

B Blank data—insert a space at this point

36. Arithmetic Functions

Arithmetic Functions

Teradata supports the following arithmetic functions, which provide additional
functionality beyond ANSI SQL:

1) 
ABS—absolute value

Example :

Abs(5) = 5
Abs(-3)  = 3

2) 
EXP—raises e to the power of <arg>

EXP(5) = 148.41


3) 
LOG—base 10 logarithm

     LOG(10) = 1

4) LN —natural logarithm

    LN(10) = 2.30

5) SQRT—square root

 SQRT(25) = 5
 SQRT(100) = 10


35. Time datatype

under construction

34. Graphic Data Types

Graphic Data Types
 
Teradata provides three data types for graphics data:

·
GRAPHIC

·
VARGRAPHIC

·
LONG VARGRAPHIC

Graphic data types support double byte graphic data for KanjiEBCDIC,
KanjiShift-JIS, and KanjiEUC character sets. Only multibyte characters
(MBCs) are stored in graphic data types, unlike character data types, which
can store a mix of MBCs and single byte characters (SBCs) when the
Teradata RBDMS is configured to support MBCs.

33.Interval Data Types

Interval Data Types

Teradata supports the following interval data types.

·
INTERVAL YEAR

·
INTERVAL MONTH

·
INTERVAL DAY

·
INTERVAL YEAR TO MONTH

·
INTERVAL HOUR

·
INTERVAL MINUTE

·
INTERVAL SECOND

·
INTERVAL DAY TO HOUR

·
INTERVAL DAY TO MINUTE

·
INTERVAL DAY TO SECOND

·
INTERVAL HOUR TO MINUTE

·
INTERVAL HOUR TO SECOND
·
INTERVAL MINUTE TO SECOND

32. Date Time Data Types

Date/Time Data Types

DATE represents a calendar date and simplifies handling and formatting
date variables, common to business applications. DATE is stored as an
integer, using this formula:

(year - 1900) * 10000 + (month * 100) + day

Warning: The Teradata DATE is not compatible with the DATE defined in
Intermediate level ANSI SQL.

31. Data Types - Byte Data

Data Types—Byte Data
 
Teradata supports two data types for binary data:

·
BYTE is for fixed-length binary strings.
·
VARBYTE is for variable-length strings.

30. Data Types - Character

Data Types—Character Data

ANSI Compliant

Two ANSI data types are available for character data:

·
CHAR is for fixed-length character strings.

·
VARCHAR is for variable-length strings. (Synonyms include CHAR
VARYING and CHARACTER VARYING.) It is intermediate-level ANSI,

Teradata Extensions

In addition to the CHAR and VARCHAR (CHAR VARYING) data types,
Teradata supports LONG VARCHAR, equivalent to VARCHAR (64000), the
largest variable-length string.

29. Arithmetic Operators

Arithmetic Operators

Teradata provides the following ANSI compliant arithmetic operators:

* multiply
/ divide
+ add
- subtract
+ positive value
- negative value

Teradata Extensions

Teradata supports two additional arithmetic operators that provide additional
functionality beyond ANSI SQL:

·
** (exponentiation) is used in the form: <n>**<arg>, where <n> is raised
to the power of <arg>.

Example: 4**3 = 4 * 4 * 4 = 64.

·
MOD is the modulo operator, which calculates the remainder in a division
operation.

Example: 60 MOD 7 = 4
(Since 60 divided by 7 equals 8, with a remainder of 4.)


Tuesday, 12 February 2013

28. Fallback Cluster

Fallback Cluster

A
cluster is a group of AMPs that act as a single Fallback unit. Clustering
has no effect on primary row distribution of the table, b ut the Fallback row
will always go to another AMP in the same cluster.

Cluster sizes are set through a Teradata console utility and may range
from 2 to 16 AMPs per cluster (not all clusters in a system have to be the
same size). The example shows an 8-AMP system set up in two clusters
of 4-AMPs each.

Should an AMP fail, the primary and Fallback row copies stored on that
AMP cannot be accessed. However, their alternate copies are available
through the other AMPs in the same cluster.

The loss of an AMP in one cluster has no effect upon other clusters. It is
possible to lose one AMP in each cluster and still have full access to all
Fallback-protected table data. If there are two AMP failures in the same
cluster, the entire Teradata system halts.

While an AMP is down, the remaining AMPs in the cluster must do their
own work plus the work of the down AMP. The larger the size of the
cluster, the less noticeable the workload increase is within that cluster
when one AMP fails. Large cluster sizes are more vulnerable to a second
failure before recovery from the first failure is complete. Remember that a
second failure halts the entire Teradata system.

If you had an 8-AMP system with 4 clusters of 2 AMPs each, the system
could lose four AMPs (one per cluster) and continue operations. If the
system workload is near capacity, there will be some loss of performance

27. Fallback

Fallback

Fallback
protects your data by storing a second copy of each row of a
table on an alternate, Fallback AMP in the same cluster.

If an AMP fails,the system accesses the Fallback rows to meet requests.

Fallback provides AMP fault tolerance at the table level. With Fallback tables, if one
AMP fails, all table data is still available. Users may continue to use
Fallback tables without any loss of available data.

During table creation or after a table is created, you may specify whether
or not the system should keep a Fallback copy of the table. If Fallback is
specified, it is automatic and transparent.

Fallback guarantees that the two copies of a row will always be on
different AMPs. If either AMP fails, the alternate row copy is still available
on the other AMP.

There is a benefit to protecting your data, but there are costs associated
with that benefit. With Fallback use, you need twice the disk space for
storage and twice the I/O for INSERTs, UPDATEs, and DELETEs. (The
Fallback option does not require any extra I/O for SELECT operations and
the Fallback I/O will be performed in parallel with the primary I/O.)

The benefits of Fallback include:

·
Protects your data from hardware (disk) failure.

·
Protects your data from software (node) failure.

·
Automatically recovers with minimum recovery time, after repairs orfixes are complete.

26. System components

Teradata System

The initial Teradata system comes with several users and databases:

-
User DBC

-
User Sysadmin

-
User SystemFE

-
Crashdumps database

-
Default, All, and Public Databases, and TDPuser

Initially, most of the space in the system belongs to the user DBC, with
some allocated to each of the users or databases listed above.

Database DBC contains all of the RDBMS software components and system tables

25. Teradata User

A Teradata User

A
user can be thought of as a collection of tables, views, macros, triggers,
and stored procedures.

A user is the same as a database except that a user can actually log on to
the RDBMS.

To logon, a user must have a
password. A user may or
may not have Perm Space.

A user could be considered an active
repository, because it is used to log on to the system as well as to store
other database objects.

Users can access other databases depending on the privileges they have
been granted.

24. Teradata Database

A Teradata Database

A database provides a logical grouping of information (tables, views, and
macros). A database could be considered a passive repository, because it
is used solely to store other database objects. This is different from a user,
which we will learn about on the next page.

Perm Space
All databases have a defined upper limit of
Permanent Space.
Permanent Space is used for storing the data rows of tables. Perm Space
is not pre-allocated. It represents a maximum limit.

Spool Space
All databases also have an upper limit of
Spool Space. If there is no limit
defined for a particular database or user, limits are inherited from parents.
Theoretically, a user could use all unallocated space in the system for
their query. Spool Space is Temporary Space used to hold intermediate
query results or formatted answer sets to queries. Once the query is
complete, the Spool Space is released.

Example
: You have a database with total disk space of 100GB. You have
10GB of user data and an additional 10GB of overhead. What is the
maximum amount of Spool Space available for queries?

Answer
: 80GB. All of the remaining space in the system is available for
spool.

Temp Space
The third type of space is
Temporary Space. Temp Space is used for
global temporary tables, and these results remain available to the user
until the session is terminated. Tables created in Temp Space will survive
a restart. Temp Space is permanent space currently not used.

23. Teradata Admin tools

Administrative Tools

Teradata Manager
Teradata Manager is a production and performance monitoring system
that simplifies the tasks of monitoring, controlling, and administering one
or more Teradata Databases.
With Teradata Manager you can use a variety of specially designed tools
and applications to gather, manipulate, and analyze information about
each Teradata system you need to administer.
It provides performance monitoring and database utilization statistics that
are collected, analyzed, and displayed. Teradata Manager can produce
its reports either in standard report format or in graphical format. Its
structure can be customized to configure it to address a wide variety of
issues that suit the needs of individuals or groups within your organization.
Teradata Manager runs on a PC that is network-connected to the
Teradata Database platform(s). It supports Windows 98, NT, 2000, and
XP.

Teradata Dynamic Query Manager (TDQM)
TDQM allows the Database Administrator to provide operational control of
and to effectively manage and regulate access to the Teradata Database.
The database administrator can use the following capabilities of TDQM to
manage work submitted to the database in order to maximize system
resource utilization:

·
Query Management

·
Scheduled Requests

With Query Management, database query requests are intercepted within
the Teradata database, their components are compared against criteria
that are defined by the administrator, and those requests that fail to meet
the criteria are rejected.

With Scheduled Requests, MS Windows clients can submit SQL requests
to be executed off-line at scheduled off-peak times.

22.Teradata Warehouse Builder

Teradata Warehouse Builder

Teradata Warehouse Builder (TWB) is a flexible, high performance data
warehouse loading tool that enables parallel data extraction,
transformation and loading processes common to all data warehouses.

TWB is scalable and enables end-to-end parallelism. The previous
versions of utilities (like FastLoad) allow you to load data into Teradata in
parallel using a single input stream. Teradata Warehouse Builder allows
you to run multiple instances of the extract, optional transformation, and
load. You can have as many loads as you have sources in the same job.
With multiple sources of data coming from multiple platforms, integration is
important in a parallel environment.

Teradata Warehouse Builder eliminates the need for persistent storage. It
stores data into data buffers so you no longer need to write data into a flat
file. Since you don’t need flat files, you no longer need to worry about a
2GB file limit.

Teradata Warehouse Builder provides a single scripting language. You
can do the extract, some transformation, and loads all in one SQL -like
scripting language. Once the dynamics of the language are learned, you
can perform multiple tasks with a single script.

You can use a script converter to convert existing scripts for utilities
(FastLoad, MultiLoad, FastExport and TPump) into Teradata Warehouse
Builder scripts.

Teradata Warehouse Builder supports FastLoad INMODs, FastExport
OUTMODs, and Access Modules to provide access to all the data sources
you use today.

21. TPump Utility

TPump Utility

TPump (Teradata Parallel Data Pump) is a utility that provides highvolume
batch maintenance of large Teradata Databases.
It enables
acquisition of data from the client with low processor utilization.
The Support Environment enhances TPump's functionality. In addition to
coordinating activities involved in TPump tasks, it provides facilities for
managing file acquisition, conditional processing, and certain DML (Data
Manipulation Language) and DDL (Data Definition Language) activities on
the Teradata Database.

The Support Environment enables an additional level of user control over TPump.

TPump uses row-hash locks, making concurrent updates on the same
table a possibility.

TPump has a built-in resource governing facility that allows the operator to
specify how many updates occur (the statement rate) minute by minute,
then change the statement rate while the job continues running. This
utility can be used to increase the statement rate during windows when
TPump is running by itself, then decrease the statement rate later on if
users log on for ad-hoc query access.

TPump can always be stopped and all work will be committed as run.
TPump does not utilize one of the loader slots used by FastLoad,
MultiLoad, and FastExport, so many TPump jobs can run concurrently.
The slide on the opposite page identifies the principal features of the
TPump utility.

20.FastExport Utility

FastExport Utility
 
FastExport is a command-driven utility that uses multiple sessions to
quickly transfer large amounts of data from tables and views on the
Teradata Database to a client-based application.

You can export data from any table or view to which you have the
SELECT access privilege. The destination for the exported data can be:

·
A file on your channel-attached or network-attached client system.

·
An Output Modification (OUTMOD) routine you write to select, validate,
and preprocess the exported data.

How it Works ?

FastExport processes a series of FastExport commands and Teradata
SQL statements you enter, usually as a batch mode job script. The
FastExport commands provide the session control and data handling
specifications for the data transfer operations. The Teradata SQL
statements perform the actual data export functions on the Teradata
tables and views.

What it Does ?

When you invoke FastExport, the utility executes the FastExport
commands and Teradata SQL statements in your FastExport job script.

These direct the FastExport utility to:

1. Log you on to Teradata for a specified number of sessions, using your
username, password and tdpid/acctid information.

2. Retrieve the specified data from Teradata, in accordance with your
format and selection specifications.

3. Export the data to the specified file or OUTMOD routine on your client
system.

4. Log you off of Teradata.

19. MultiLoad Utility

MultiLoad Utility

The MultiLoad utility gives you an efficient way to deal with batch
maintenance of large databases. MultiLoad is a command -driven utility
you can use to do fast, high-volume maintenance on multiple tables and
views of a Teradata Database. Using a single MultiLoad job, you can do a
number of different import and delete tasks on database tables and views:

·
Each MultiLoad import task can do multiple data insert, update, and
delete functions on up to five different tables or views.

·
Each MultiLoad delete task can remove large numbers of rows from a
single table. You can use MultiLoad to import data from:

·
Disk or tape files on a channel-attached client system

·
Input files on a network-attached workstation

·
Special input module (INMOD) programs you write to select,
validate, and preprocess input data

·
Access modules

·
Any device providing properly formatted source data
The table or view in the database receiving the data can be any existing
table or view for which you have access privileges for the maintenance
tasks you want to do.

How it Works ?

MultiLoad processes a series of MultiLoad commands and Teradata SQL
statements you enter usually as a batch mode job script. You use the
MultiLoad commands for session control and data handling of the data
transfers. The Teradata SQL statements do the actual maintenance
functions on the database tables and views.

17. Teradata FastLoad Utility

FastLoad Utility

FastLoad is a command-driven utility you can use to quickly load large
amounts of data in an empty table on a Teradata Database.

You can load data from:

·
Disk or tape files on a channel-attached client system
·
Input files on a network-attached workstation
·
Special Access module (AXSMOD) or input module (INMOD) routines
you write to select, validate, and preprocess input data
·
Any other device providing properly formatted source data

FastLoad uses multiple sessions to load data. It loads data into a single
table on a Teradata system per job. If you want to load data into more
than one table in the system, you can submit multiple FastLoad jobs-one
for each table.

How It Works
FastLoad processes a series of FastLoad commands and Teradata SQL
statements you enter either interactively or in batch mode. You use the
FastLoad commands for session control and data handling of the data
transfers. The Teradata SQL statements create, maintain and drop tables
on the Teradata Database.
During a load operation, FastLoad inserts the data from each record of
your data source into one row of the table on a Teradata Database. The
table on the Teradata system receiving the data must be empty and have
no defined secondary indexes.

What It Does
When you invoke FastLoad, the utility executes the FastLoad commands
and Teradata SQL statements in your FastLoad job script. These direct
the FastLoad utility to:

1. Log you on to Teradata for a specified number of sessions, using your
username, password and tdpid/acctid information.

2. Load the input data into the FastLoad table on the Teradata system.

3. Log you off from Teradata.

16. Teradata SQL Tools


SQL Tools
Teradata provides two front-end SQL query products.

BTEQ
BTEQ stands for Basic Teradata Query utility. It is an SQL front-end that
runs on all client platforms. It supports both interactive ad hoc and
scripted batch queries and provides standard report writing and formatting
functionality. It also provides a basic import/export capability. Other
utilities are available for these functions, particularly when larger tables
are involved. When a user logs on with BTEQ, they can set the session
number they need before logging on. Using multiple sessions may
improve performance for certain operations.

Teradata SQL Assistant
Teradata SQL Assistant (formerly called
Queryman) is an SQL front-end
for Teradata, as well as any other ODBC compliant database. It offers a
full menu of services including query history, timings, status, row counts,
random sampling, and limited answer sets. It provides an import/export
feature between database and PC and also allows export to Excel and
Access.

14. Teradata Data Dictionary

The Data Dictionary (DD)


The Data Dictionary (DD) is an integrated set of system tables that:
-
Stores database object definitions and accumulates information about
users
, databases, resource usage, data demographics, and
security rules
.

-
Records specifications about tables, views, and macros.

-
Contains information about ownership, space allocation,
accounting
, and access rights (privileges) for these objects.

Data Dictionary information is updated automatically during the processing
of Teradata SQL
data definition language (DDL) statements. It is used
by the Parser to obtain information needed to process all Teradata SQL
statements.

Users may access the Data Dictionary through Teradata-supplied views, if
permitted by the System Administrator.

15. Teradata SQL

Structured Query Language (SQL)

Structured Query Language (SQL) is the language used to access the
database. It is sometimes referred to as a Fourth Generation Language
(4GL) to differentiate it from Third Generation Languages such as
FORTRAN and COBOL, though it is quite different from other 4GLs. It
acts as an intermediary between the user and the database. SQL defines
the answer set that is returned from the database.
SQL is different from other computer languages. Its statements resemble
English-like structures. It provides powerful, set-oriented database
manipulation including
structural modification, data retrieval,
modification
, and security functions.

There are four important subsets of SQL: the Data Definition Language,
the Data Manipulation Language, Data Control Language and Stored
Procedure Language.

Data Definition Language (DDL)
DDL allows a user to define the
database objects and the relationships
that exist among them
. Examples of DDL uses are creating or modifying
tables and views.

Data Manipulation Language (DML)
DML consists of the
statements that manipulate, change, or retrieve
the data rows
of the database. If the DDL defines the database, the DML
lets the user change the information contained in the database. The DML
is the most commonly used subset of SQL. It is used to select, update,
delete, and insert rows.

Data Control Language (DCL)
DCL is used to
restrict or permit a user's access. It can selectively limit
a user's ability to retrieve, add, or modify data. It is used to grant and
revoke access privileges on tables and views.

Stored Procedure Language (SPL)
SPL is used to
create and call stored procedures in Teradata.

Teradata Extensions to SQL
User Assistance commands allow you to list the objects in a database or
the characteristics of a table, see how a query will execute, or show the
details of your system. They vary widely from vendor to vendor.

13. Teradata objects

Teradata Objects

A
database in the Teradata system is a collection of objects known as
tables
, views, macros, triggers and stored procedures. Databases
provide a
logical grouping for information. They are also the
foundation for space allocation and access control.

1) Tables
A table is the logical structure of data in an RDBMS. It is a
twodimensional
structure made up of columns and rows
. A user defines
a table by giving it a table name that refers to the type of data that will be
stored in the table (e.g., an Employee table stores data about employees.)

A
column represents attributes of the table. Column names are given to
each column of the table. All information in a column is of the same type.
For example, a column named Date of Birth would only hold date of birth
information. Each occurrence of an entity is stored in the table as a
row.
Entities are the people, places, things, or events that the table is
describing. Tables require Permanent Space to store rows.

2) Views
A view is a
pre-defined subset of one or more tables or other views. It
does not exist as a real table, but serves as a reference to existing tables
or views. One way to think of a view is as a virtual table. Views have
definitions in the Data Dictionary (DD) but do not contain any physical
rows. Views can be used to control access to the underlying tables.
Views can be used to hide columns from users, to insulate applications
from database changes, and to standardize or simplify access techniques.
Views do not require Permanent Space.

3) Macros
A macro is a predefined, stored set of one or more SQL commands and
report-formatting commands. Macros are used to simplify the execution of
frequently used SQL commands. Macros do not require Permanent
Space.

4) Triggers
A trigger is an event-driven procedure attached to a table. A trigger
defines events that happen when some other event, called a
triggering
event
, occurs. A trigger consists of one or more SQL statements that are
associated with a table and are executed when the trigger is fired. A
trigger is created with the CREATE TRIGGER statement.

5) Stored Procedures
A stored procedure is a pre-defined set of statements invoked through a
single SQL CALL statement. Stored procedures may contain both
Teradata SQL statements and procedural statements (in Teradata,
referred to as Stored Procedure Language, or SPL). Macros are similar to
stored procedures but do not contain SPL.

12. Teradata Client tools

Teradata Client Software

The Teradata Database requires three distinct pieces of software: TPA,
PDE, and OS.

An operating system (OS), UNIX or Windows 2000, and a Teradata
software license are necessary for each node.

A Trusted Parallel Application (TPA) implements virtual processors and
runs on the OS with PDE. The Teradata Databse is classified as a TPA.

The components of the Teradata Database software include:

·
Channel Driver
·
Teradata Gateway
·
AMP
·
PE

Parallel Database Extensions (PDE) were added to the operating system
to support the parallel software environment.

On UNIX, a virtual processor (vproc) is a collection of software processes
running under the multi-tasking environment of the UNIX operating
system.

On Windows 2000, a vproc is a software process that runs under the
multi-tasking environment of the Windows 2000 operating system.
The client may be a mainframe system, such as IBM or Amdahl, which is
channel-attached to the Teradata Database, or it may be a PC or UNIX -
based system that is LAN-attached.
The client application submits an SQL request to the RDBMS, receives

the response, and submits the response to the user.

11. What is AMP ?

The Access Module Processor (AMP)

The
Access Module Processor (AMP) is the virtual processor (vproc) in
Teradata's shared-nothing architecture that is responsible for managing a
portion of the database. An AMP will control some portion of each table
on the system. AMPs do the physical work associated with generating an
answer set, including sorting, aggregating, formatting and converting. The
AMPs perform all database management functions in the system.
An AMP responds to Parser/Optimizer steps transmitted across the
BYNET by selecting data from or storing data to its disks. For some
requests, the AMPs may redistribute a copy of the data to other AMPs.

The
Database Manager subsystem resides on each AMP. This
subsystem will:

·
Lock databases and tables.
·
Create, modify, or delete definitions of tables.
·
Insert, delete, or modify rows within the tables.
·
Retrieve information from definitions and tables.
·
Return responses to the Dispatcher.

 The database manager subsystem provides a bridge between that
logical organization and the physical organization of the data on disks.
The Database Manager performs a space-management function that
controls the use and allocation of space.
Teradata performs all tasks in parallel, providing exceptional performance.
The greater the number of tasks processed in parallel, the better the
system performance. Many databases call themselves "parallel", but they
can only perform some tasks in parallel.

10. What is BYNET ?

BYNET


The
BYNET handles the internal communication of the Teradata
Database. All communication between PEs and AMPs is done via the
BYNET.

When the PE dispatches the steps for the AMPs to perform, they are
dispatched onto the BYNET. The messages are routed to the appropriate
AMP(s) where results sets and status information are generated. This
response information is also routed back to the requesting PE via the
BYNET.
Depending on the nature of the dispatch request, the communication
between nodes may be a:

·
Broadcast—message is routed to all nodes in the system.
·
Point-to-point—message is routed to one specific node in the
system.

Once the message is on a participating node, PDE directs the message to
the appropriate AMPs on that node. All AMPs receive a broadcast
message. With a point-to-point or multicast (multiple AMPs) message, the
message is directed only to the appropriate AMP(s) on that node.
So, while a Teradata system does do multicast messaging, the BYNET
hardware alone cannot do it - the BYNET can only do point-to-point and
broadcast between nodes.
The BYNET has several unique features:

Fault tolerant:
each network has multiple connection paths. If the BYNET
detects an unusable path in either network, it will automatically reconfigure
that network so all messages avoid the unusable path. Additionally, in the
rare case that BYNET 0 cannot be reconfigured, hardware on BYNET 0 is
disabled and messages are re-routed to BYNET 1, and vice versa.

Load balanced:
traffic is automatically and dynamically distributed
between both BYNETs.

Scalable:
as you add nodes to the system, overall network bandwidth
scales linearly - meaning an increase in system size without loss of
performance.

High Performance:
an MPP system has two BYNET networks. Because
both networks are active, the system benefits from the full aggregate
bandwidth.

The technology of the BYNET is what makes the Teradata parallelism
possible.

9.Parsing Engine

The Parsing Engine

A
Parsing Engine (PE) is a virtual processor (vproc). It is made up of the
following software components: Session Control, the Parser, the
Optimizer, and the Dispatcher.
Once a valid session has been established, the PE is the component that
manages the dialogue between the client application and the RDBMS.
Each PE can handle up to 120 sessions, and each session can handle
multiple requests.

Session Control
The major functions performed by Session Control are logon and logoff.
Logon takes a textual request for session authorization, verifies it, and
returns a yes or no answer. Logoff terminates any ongoing activity and
deletes the session’s context.

Parser
The Parser interprets SQL statements, checks them for proper SQL
syntax and evaluates them semantically. The PE also consults the Data
Dictionary to ensure that all objects and columns exist and that the user
has authority to access these objects.

Optimizer
The
Optimizer is responsible for developing the least expensive plan to
return the requested response set. Processing alternatives are evaluated
and the fastest plan is chosen. This plan is converted to executable steps,
to be performed by the AMPs, which are then passed to the dispatcher. In
order to maximize throughput and minimize resource contention, the
optimizer must know about
system configuration, available units of
parallelism
(AMPs and PE's), and data demographics. The Teradata
Optimizer is robust and intelligent. The optimizer enables Teradata to
handle multiple complex, ad hoc queries efficiently. It is
parallel-aware
and
cost-based and uses full look-ahead capability.

Dispatcher
The Dispatcher controls the sequence in which the steps are executed
and passes the steps on to the BYNET. It is composed of executioncontrol
and response-control tasks. Execution control receives the step
definitions from the Parser and transmits them to the appropriate AMP(s)
for processing, receives status reports from the AMPs as they process the
steps, and passes the results on to response control once the AMPs have
completed processing. Response control returns the results to the user.
The Dispatcher sees that all AMPs have finished a step before the next
step is dispatched. Depending on the nature of the SQL request, a step
will be sent to one AMP, or broadcast to all AMPs.