Using SQL within CLSD
This page is presents some examples of using SQL within CLSD. For a very brief introduction to SQL, see SQL and the relational algebra. For more detailed information about SQL, see these online tutorials:
For detailed information about using SQL to build queries for dbSNP data see Using dbSNP via SQL queries.
Note that within DB2, CLSD appears as a collection of "schemas", each of which includes a collection of "tables", and that many of the tables are actually "virtual". That is, they are not physically present on local hardware, but are accessed over the network as needed.
Getting general information about the CLSD database
- To get a list of all tables available within the CLSD instance of DB2:
select tabname from syscat.tables
- To get all the information within syscat.tables about each table:
select * from syscat.tables
- To find other information about the CLSD instance, consider this
material taken from the Selector
web site:
There are 3 default tablespaces (TABLESPACES):
- SYSCATSPACE
- TEMPSPACE
- USERSPACE1
Example:
select * from SYSCAT.TABLESPACES
There are 4 schemas within SYSCATSPACE
- SYSIBM - system tables (read only)
- SYSCAT - views of system tables (read only)
- SYSFUN - stored functions
- SYSSTAT - system use statistics
Here is a table of some useful tables. Note that for each entry in the SYSIBM column there is a corresponding entry in the SYSCAT column.
Database Object SYSIBM system table SYSCAT system table view SYSSTAT updateable view Tablespace SYSTABLESPACES TABLESPACES Table SYSTABLES TABLES TABLES Schema SYSSCHEMATA SCHEMATA View SYSVIEWS VIEWS Column SYSCOLUMNS COLUMNS COLUMNS Examples from SYSIBM system table area:
select * from SYSIBM.TABLES select * from SYSIBM.SYSTABLESand examples from the SYSCAT area:select * from SYSCAT.SCHEMATA select * from SYSCAT.TABLES select typename from SYSCAT.DATATYPES - To get a list of all tables or tablenames in schema dbSNP126_SHARED:
select * from syscat.tables where tabschema like '%DBSNP126%' select tabname from syscat.tables where tabschema='DBSNP126_SHARED'
Note that "like" operands are case-sensitive. - To get a list of rowcounts for all tables in schema dbSNP125_human:
select * from dbsnp125_human.DN_TABLE_ROWCOUNT
- To get a description of each column in each table within schema
dbSNP126_shared:
select * from syscat.tables where tabschema = 'DBSNP126_SHARED'
Getting specific information from the database
This section shows some queries that display table contents rather than table "metadata".
- To get all columns of the alleles table of schema dbSNP126_shared:
select * from dbsnp126_shared.allele
(Note that this contain more rows than allowed by the default maximum row count of 5000 employed by the reference web interface. If you want to see all the entries, you will have to adjust the maximum row count parameter on the web-based form.)On the other hand, to limit the number of rows returned by the SQL command itself, you can use:
select * from dbsnp126_shared.allele fetch first 100 rows only
- To count the number of entries in the allele database:
select count(*) from dbSNP126_shared.allele
- To see the entire HapMap classification code table (9 entries):
select * from dbsnp126_human.hapmapclasscode
- To get a list of just the codes and their abbreviations:
select code,abbrev from dbsnp126_human.hapmapclasscode
- To run a a BLASTN_NT search:
select GB_ACC_NUM,DESCRIPTION from NCBI.BLASTN_NT where BLASTSEQ='AGTACTAGCTAGCTAGCTACTAGCTGACTGACTGACTGATGCATCGATGATGC'
Note that the string "BLASTN" includes a trailing "N", and that query nucleotide and AA sequences must be at least 15 bytes long.
See Using BLAST via CLSD for more information.
- To get a list of BLAST results filtered by a Prosite pattern:
select gb_acc_num,HSP_H_SEQ from ncbi.blastp_nr where blastseq='MSQICKRGLLISNRLAPAALRCKSTWFSEVQMGPPDAILGVTEAFKKDTNPKKI \ NLGAGAYRDDNTQPFVLPSVREAEKRVVSRSLDKEYATIIGIPEFYNKAIELALGKGSKRLAA \ KHNVTAQSISGTGALRIGAAFLAKFWQGNREIYIPSPSWGNHVAIFEHAGLPVNRYRYYDKDT' and DB2LS.LSPatternMatch(HSP_H_SEQ, db2ls.LSPrositePattern('[GS]-[LIVMFYTAC]-[GSTA]-K-x(2)-[GSALVN].') ) > 0 Note the use of the period (.) to terminate the Prosite pattern, and that the LSPatternMatch function returns the character position of the left-most substring matching the pattern, or zero if there is no match. - To get a list of genes containing "brain" in their LOCUS_NAME in
dbSNP126_shared:
select * from DBSNP126_SHARED.GENEIDTONAME where locus_name like '%brain%'
Note that "brain" is case specific. - To get a list of Bind Genes and their species:
select GeneNameA,Organism from bind.bind_interaction
- To get a list of genes mentioning "HUMAN" in their descriptions in KEGG:
select * from KEGG.GENE where description like '%HUMAN%'
Note that "like" operands are case-sensitive. - To get a list of all columns in the "REAC%" tables:
select * from syscat.columns where TABNAME like 'REAC%'
- To get a list of each reaction (showing reactants, enzymes and
products from the Ligand database) where the enzyme codes have been
replaced with the enzyme names, as held in the Enzyme database, for
each pathway recorded in the Ligand database:
select pw.PATH_ID, pw.REAC_ID, re.COMP_ID, sx.DESCRIPTION, pr.COMP_ID from Ligand.PATHREAC pw, Ligand.REACREAC re, Ligand.REACENZY en, Ligand.REACPROD pr, Enzyme.general sx where en.REAC_ID=re.REAC_ID AND en.REAC_ID=pw.REAC_ID AND en.REAC_ID=pr.REAC_ID AND sx.EC_ID=en.EC_ID order by pw.PATH_ID,pw.REAC_ID
- To get some info from PubMed:
select PMID, ArticleTitle FROM NCBI.pmarticles where entrez.contains (ArticleTitle, 'granulation') = 1 AND entrez.contains (PubDate, '1992') = 1
- To get a list of sequences involving JAK3 from NCBI Nucleotide:
SELECT PRIMARYACCESSION, SEQLENGTH, DEFINITION, SEQUENCE FROM NCBI.GBSEQ WHERE ( ENTREZ.CONTAINS( ORGANISM, 'HUMAN') = 1 ) AND ( ENTREZ.CONTAINS( DEFINITION, 'JAK3' ) = 1 )
Additional information
If you have received SQL error messages from this script, please consult this IBM Wanrning, error and completion messages information.
This index into the IBM help files might also be helpful.
If you have questions about this page, please contact Michael Grobe at dgrobe @ iupui.edu.



