kfetchbatch

Function

Fetch a set of entries in a defined format and style

Description

kfetchbatch fetches a set of entries in a defined format and style using the dbfetch service. dbfetch allows you to retrieve entries from various up-to-date biological databases using entry identifiers or accession numbers.

kfetchbatch utilizes the fetchbatch method of the dbfetch SOAP service provided by
EBI (European Bioinformatics Institute). The original web service is located at the following URL:

http://www.ebi.ac.uk/cgi-bin/dbfetch

This tool is part of the Keio Bioinformatics Web Service (KBWS) package, which contains interfaces to bioinformatics web services through a proxy server at Keio University. kfetchbatch is thus an interface to the "runFetchBatch" method of the KBWS SOAP service.
This method can alternatively be accessed directly from programming languages as a SOAP web service. Please refer to the KBWS online documentation at http://www.g-language.org/kbws/ for more information.
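
As a rough illustration of what a batch request carries, the sketch below builds a dbfetch query URL in Python. It assumes that the plain HTTP interface of dbfetch accepts "db", "id", "format" and "style" query parameters at the URL given above; these parameter names and the helper name build_dbfetch_url are assumptions, and this is not the KBWS SOAP call itself.

```python
from urllib.parse import urlencode

# Base URL quoted in this document.
DBFETCH_URL = "http://www.ebi.ac.uk/cgi-bin/dbfetch"


def build_dbfetch_url(dbname, ids, fmt="default", style="raw"):
    """Return a GET URL requesting several entries at once.

    The query parameter names (db, id, format, style) are an assumption
    about the dbfetch HTTP interface, not something this document states.
    """
    query = urlencode(
        {
            "db": dbname,
            "id": ",".join(ids),  # identifiers are sent as one comma-separated string
            "format": fmt,
            "style": style,
        },
        safe=",",  # keep the commas readable instead of percent-encoding them
    )
    return DBFETCH_URL + "?" + query


if __name__ == "__main__":
    print(build_dbfetch_url("uniprotkb", ["WAP_MOUSE", "WAP_RAT"]))
```

The function only builds the string; fetching it is left to the caller, so the sketch can be tried without network access.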

Usage


  Here is a sample session with kfetchbatch

% kfetchbatch
Fetch a set of entries in a defined format and style
dbname (e.g. uniprotkb): uniprotkb
id list (e.g. 1433T_RAT,WAP_RAT):  WAP_MOUSE,WAP_RAT
Output file [outfile.kfetchbatch]: stdout


ID   WAP_MOUSE               Reviewed;         134 AA.
AC   P01173; P70230; Q61023;
DT   21-JUL-1986, integrated into UniProtKB/Swiss-Prot.
DT   04-MAY-2001, sequence version 3.
DT   24-NOV-2009, entry version 86.
DE   RecName: Full=Whey acidic protein;
DE            Short=WAP;
DE   Flags: Precursor;
GN   Name=Wap;
OS   Mus musculus (Mouse).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Sciurognathi;
OC   Muroidea; Muridae; Murinae; Mus.
OX   NCBI_TaxID=10090;
RN   [1]
RP   NUCLEOTIDE SEQUENCE [MRNA].
RX   MEDLINE=82196900; PubMed=6896234; DOI=10.1093/nar/10.8.2677;
RA   Hennighausen L.G., Sippel A.E.;
RT   "Mouse whey acidic protein is a novel member of the family of 'four-
RT   disulfide core' proteins.";
RL   Nucleic Acids Res. 10:2677-2684(1982).
RN   [2]
RP   NUCLEOTIDE SEQUENCE [GENOMIC DNA].
RX   MEDLINE=85062841; PubMed=6095207; DOI=10.1093/nar/12.22.8685;
RA   Campbell S.M., Rosen J.M., Hennighausen L.G., Strech-Jurk U.,
RA   Sippel A.E.;
RT   "Comparison of the whey acidic protein genes of the rat and mouse.";
RL   Nucleic Acids Res. 12:8685-8697(1984).
RN   [3]
RP   NUCLEOTIDE SEQUENCE [GENOMIC DNA].
RC   STRAIN=GR;
RA   Hennighausen L.;
RL   Submitted (OCT-1995) to the EMBL/GenBank/DDBJ databases.
RN   [4]
RP   CHARACTERIZATION.
RC   STRAIN=YBR;
RX   MEDLINE=82052974; PubMed=6975276;
RA   Piletz J.E., Heinlen M., Ganschow R.E.;
RT   "Biochemical characterization of a novel whey protein from murine
RT   milk.";
RL   J. Biol. Chem. 256:11509-11516(1981).
CC   -!- FUNCTION: Could be a protease inhibitor. May play an important
CC       role in mammary gland development and tissue remodeling.
CC   -!- SUBCELLULAR LOCATION: Secreted.
CC   -!- TISSUE SPECIFICITY: Milk-specific; major protein component of milk
CC       whey.
CC   -!- PTM: No phosphate or carbohydrate binding could be detected;
CC       however, both cholesterol and triglyceride are associated with the
CC       mouse protein.
CC   -!- SIMILARITY: Contains 2 WAP domains.
CC   -----------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution-NoDerivs License
CC   -----------------------------------------------------------------------
DR   EMBL; V00856; CAA24224.1; -; mRNA.
DR   EMBL; X01157; CAA25604.1; -; Genomic_DNA.
DR   EMBL; X01158; CAA25604.1; JOINED; Genomic_DNA.
DR   EMBL; X01159; CAA25604.1; JOINED; Genomic_DNA.
DR   EMBL; X01160; CAA25604.1; JOINED; Genomic_DNA.
DR   EMBL; U38816; AAA91321.1; -; Genomic_DNA.
DR   IPI; IPI00323750; -.
DR   PIR; A93423; WYMS.
DR   UniGene; Mm.268094; -.
DR   STRING; P01173; -.
DR   PRIDE; P01173; -.
DR   Ensembl; ENSMUST00000102910; ENSMUSP00000099974; ENSMUSG00000000381; Mus musculus.
DR   MGI; MGI:98943; Wap.
DR   HOVERGEN; P01173; -.
DR   ArrayExpress; P01173; -.
DR   Bgee; P01173; -.
DR   CleanEx; MM_WAP; -.
DR   Genevestigator; P01173; -.
DR   GermOnline; ENSMUSG00000000381; Mus musculus.
DR   GO; GO:0005576; C:extracellular region; IEA:UniProtKB-SubCell.
DR   GO; GO:0030414; F:peptidase inhibitor activity; IEA:UniProtKB-KW.
DR   InterPro; IPR008197; Whey_acidic_protein_4-diS_core.
DR   Pfam; PF00095; WAP; 2.
DR   SMART; SM00217; WAP; 1.
DR   PROSITE; PS51390; WAP; 2.
PE   1: Evidence at protein level;
KW   Disulfide bond; Milk protein; Protease inhibitor; Repeat; Secreted;
KW   Signal.
FT   SIGNAL        1     19       By similarity.
FT   CHAIN        20    134       Whey acidic protein.
FT                                /FTId=PRO_0000041350.
FT   DOMAIN       27     73       WAP 1; atypical.
FT   DOMAIN       76    128       WAP 2.
FT   DISULFID     45     65       By similarity.
FT   DISULFID     48     60       By similarity.
FT   DISULFID     83    116       By similarity.
FT   DISULFID     97    120       By similarity.
FT   DISULFID    103    115       By similarity.
FT   DISULFID    109    124       By similarity.
FT   CONFLICT      2      2       R -> S (in Ref. 2; CAA25604).
FT   CONFLICT     11     11       L -> R (in Ref. 3; AAA91321).
FT   CONFLICT     35     35       Q -> P (in Ref. 1; CAA24224).
FT   CONFLICT     63     63       G -> R (in Ref. 1; CAA24224).
FT   CONFLICT     63     63       G -> V (in Ref. 2; CAA25604).
FT   CONFLICT     87     87       L -> S (in Ref. 2; CAA25604).
FT   CONFLICT    100    100       K -> Q (in Ref. 1; CAA24224).
SQ   SEQUENCE   134 AA;  14423 MW;  C12B2544877ECA80 CRC64;
MRCLISLVLG LLALEVALAQ NLEEQVFNSV QSMFQKASPI EGTECIICQT NEECAQNAMC
CPGSCGRTRK TPVNIGVPKA GFCPWNLLQT ISSTGPCPMK IECSSDRECS GNMKCCNVDC
VMTCTPPVPV ITLQ
//
ID   WAP_RAT                 Reviewed;         137 AA.
AC   P01174;
DT   21-JUL-1986, integrated into UniProtKB/Swiss-Prot.
DT   01-OCT-1989, sequence version 2.
DT   24-NOV-2009, entry version 79.
DE   RecName: Full=Whey acidic protein;
DE   AltName: Full=Whey phosphoprotein;
DE   AltName: Full=WAP;
DE   Flags: Precursor;
GN   Name=Wap;
OS   Rattus norvegicus (Rat).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Sciurognathi;
OC   Muroidea; Muridae; Murinae; Rattus.
OX   NCBI_TaxID=10116;
RN   [1]
RP   NUCLEOTIDE SEQUENCE [GENOMIC DNA].
RX   MEDLINE=85062841; PubMed=6095207; DOI=10.1093/nar/12.22.8685;
RA   Campbell S.M., Rosen J.M., Hennighausen L.G., Strech-Jurk U.,
RA   Sippel A.E.;
RT   "Comparison of the whey acidic protein genes of the rat and mouse.";
RL   Nucleic Acids Res. 12:8685-8697(1984).
RN   [2]
RP   NUCLEOTIDE SEQUENCE [MRNA].
RX   MEDLINE=82274212; PubMed=6896749; DOI=10.1093/nar/10.12.3733;
RA   Hennighausen L.G., Sippel A.E.;
RT   "Comparative sequence analysis of the mRNAs coding for mouse and rat
RT   whey protein.";
RL   Nucleic Acids Res. 10:3733-3744(1982).
RN   [3]
RP   NUCLEOTIDE SEQUENCE [MRNA], PROTEIN SEQUENCE OF 20-41, AND DISULFIDE
RP   BONDS.
RX   MEDLINE=82275050; PubMed=6955785; DOI=10.1073/pnas.79.13.3987;
RA   Dandekar A.M., Robinson E.A., Appella E., Qasba P.K.;
RT   "Complete sequence analysis of cDNA clones encoding rat whey
RT   phosphoprotein: homology to a protease inhibitor.";
RL   Proc. Natl. Acad. Sci. U.S.A. 79:3987-3991(1982).
CC   -!- FUNCTION: Could be a protease inhibitor.
CC   -!- SUBCELLULAR LOCATION: Secreted.
CC   -!- TISSUE SPECIFICITY: Milk-specific; major protein component of milk
CC       whey.
CC   -!- PTM: Contains 8 disulfide bonds.
CC   -!- SIMILARITY: Contains 2 WAP domains.
CC   -----------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution-NoDerivs License
CC   -----------------------------------------------------------------------
DR   EMBL; X01153; CAA25600.2; -; Genomic_DNA.
DR   EMBL; X01154; CAA25600.2; JOINED; Genomic_DNA.
DR   EMBL; X01155; CAA25600.2; JOINED; Genomic_DNA.
DR   EMBL; X01156; CAA25600.2; JOINED; Genomic_DNA.
DR   EMBL; J00802; AAA42347.1; -; mRNA.
DR   EMBL; J00801; AAA42346.1; -; mRNA.
DR   IPI; IPI00210939; -.
DR   PIR; A93920; WYRT.
DR   RefSeq; NP_446203.1; -.
DR   UniGene; Rn.9978; -.
DR   STRING; P01174; -.
DR   Ensembl; ENSRNOT00000010744; ENSRNOP00000010744; ENSRNOG00000008097; Rattus norvegicus.
DR   GeneID; 114596; -.
DR   KEGG; rno:114596; -.
DR   UCSC; NM_053751; rat.
DR   CTD; 114596; -.
DR   RGD; 621851; Wap.
DR   HOVERGEN; P01174; -.
DR   NextBio; 618763; -.
DR   ArrayExpress; P01174; -.
DR   Genevestigator; P01174; -.
DR   GermOnline; ENSRNOG00000008097; Rattus norvegicus.
DR   GO; GO:0005576; C:extracellular region; IEA:UniProtKB-SubCell.
DR   GO; GO:0030414; F:peptidase inhibitor activity; NAS:RGD.
DR   InterPro; IPR015874; 4-disulphide_core.
DR   InterPro; IPR008197; Whey_acidic_protein_4-diS_core.
DR   Gene3D; G3DSA:4.10.75.10; Whey_acidic_protein_4-diS_core; 1.
DR   Pfam; PF00095; WAP; 2.
DR   PRINTS; PR00003; 4DISULPHCORE.
DR   SMART; SM00217; WAP; 1.
DR   PROSITE; PS51390; WAP; 2.
PE   1: Evidence at protein level;
KW   Direct protein sequencing; Disulfide bond; Milk protein;
KW   Protease inhibitor; Repeat; Secreted; Signal.
FT   SIGNAL        1     19
FT   CHAIN        20    137       Whey acidic protein.
FT                                /FTId=PRO_0000041353.
FT   DOMAIN       27     73       WAP 1.
FT   DOMAIN       76    127       WAP 2.
FT   DISULFID     34     61       Probable.
FT   DISULFID     45     65       Probable.
FT   DISULFID     48     60       Probable.
FT   DISULFID     54     69       Probable.
FT   DISULFID     83    115       Probable.
FT   DISULFID     96    119       Probable.
FT   DISULFID    102    114       Probable.
FT   DISULFID    108    123       Probable.
FT   CONFLICT      4      4       S -> F (in Ref. 3; AAA42347).
FT   CONFLICT     35     35       S -> P (in Ref. 2 and 3).
FT   CONFLICT     39     39       F -> S (in Ref. 2 and 3).
FT   CONFLICT     47     47       N -> K (in Ref. 2; AAA42346).
FT   CONFLICT     68     68       S -> P (in Ref. 2 and 3).
FT   CONFLICT     99     99       D -> G (in Ref. 2; AAA42346).
FT   CONFLICT    116    116       K -> N (in Ref. 2 and 3).
FT   CONFLICT    127    127       E -> K (in Ref. 2).
FT   CONFLICT    129    129       K -> D (in Ref. 2).
FT   CONFLICT    129    129       K -> E (in Ref. 3; AAA42347).
FT   CONFLICT    134    134       I -> V (in Ref. 3; AAA42347).
SQ   SEQUENCE   137 AA;  14827 MW;  1C2E8ADA9FD97949 CRC64;
MRCSISLVLG LLALEVALAR NLQEHVFNSV QSMCSDDSFS EDTECINCQT NEECAQNDMC
CPSSCGRSCK TPVNIEVQKA GRCPWNPIQM IAAGPCPKDN PCSIDSDCSG TMKCCKNGCI
MSCMDPEPKS PTVISFQ
//
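
Because a batch result concatenates several entries, a script that consumes it usually has to split them again. The following Python sketch assumes the default flat-file format shown above, in which each entry has an "ID" line and ends with a "//" line. The function names split_entries and entry_name are hypothetical.

```python
def split_entries(text):
    """Split flat-file text into entries, each ending with a '//' line.

    Any trailing text that is not closed by '//' is ignored.
    """
    entries, current = [], []
    for line in text.splitlines():
        current.append(line)
        if line.strip() == "//":
            entries.append("\n".join(current))
            current = []
    return entries


def entry_name(entry):
    """Return the entry name from the first 'ID' line, or None if absent."""
    for line in entry.splitlines():
        if line.startswith("ID "):
            return line.split()[1]
    return None


if __name__ == "__main__":
    sample = (
        "ID   WAP_MOUSE               Reviewed;         134 AA.\n"
        "AC   P01173; P70230; Q61023;\n"
        "//\n"
        "ID   WAP_RAT                 Reviewed;         137 AA.\n"
        "AC   P01174;\n"
        "//\n"
    )
    print([entry_name(e) for e in split_entries(sample)])
```

Other values of -format (for example fasta) use different record delimiters, so this splitter applies only to the default flat-file output.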

	  

Command line arguments

Qualifier       Type     Description                                  Allowed values  Default
Standard (Mandatory) qualifiers
[-dbname]       string   Database name (e.g. uniprotkb)               Any string      Required
(Parameter 1)
[-idlist]       string   A string containing a comma-separated        Any string      Required
(Parameter 2)            list of entry identifiers
                         (e.g. 1433T_RAT,WAP_RAT)
[-outfile]      outfile  Output file name                             Output file     <*>.kfetchbatch
(Parameter 3)
Additional (Optional) qualifiers
-format         string   Output data format: 'default', 'annot',      Any string      default
                         'embl', 'emblxml-1.1', 'emblxml-1.2',
                         'fasta', 'insdxml', 'seqxml' or
                         'entrysize'
-style          string   Output data style: 'raw' or 'html'           Any string      raw
Advanced (Unprompted) qualifiers
(none)
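
For use from a script, the qualifiers above map directly onto a command line. The Python sketch below assembles such a command and checks -format and -style against the values listed in the descriptions. The helper name build_command is hypothetical, and the check is stricter than the tool itself, which accepts any string for both qualifiers.

```python
# Values listed in the qualifier descriptions of this document.
FORMATS = {"default", "annot", "embl", "emblxml-1.1", "emblxml-1.2",
           "fasta", "insdxml", "seqxml", "entrysize"}
STYLES = {"raw", "html"}


def build_command(dbname, ids, outfile="stdout", fmt="default", style="raw"):
    """Return an argument list for kfetchbatch, suitable for subprocess.run."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    if style not in STYLES:
        raise ValueError(f"unknown style: {style}")
    return [
        "kfetchbatch",
        "-dbname", dbname,
        "-idlist", ",".join(ids),  # one comma-separated string
        "-outfile", outfile,
        "-format", fmt,
        "-style", style,
    ]


if __name__ == "__main__":
    cmd = build_command("uniprotkb", ["WAP_MOUSE", "WAP_RAT"], fmt="fasta")
    print(" ".join(cmd))
    # To run it, kfetchbatch must be installed and on the PATH:
    #   import subprocess
    #   result = subprocess.run(cmd, capture_output=True, text=True)
```

Passing the arguments as a list avoids shell quoting problems with the comma-separated identifier string.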

Input file format

kfetchbatch does not read an input file. It takes a database name and a comma-separated list of entry identifiers as string parameters.

Output file format

kfetchbatch writes the fetched entries as text, in the format given by -format and the style given by -style.
Generally the result is a single item containing the complete text of the requested entries.

Data files

None.

Notes

None.

References

      Labarga, A., et al. (2007) Web Services at the European Bioinformatics
      Institute. Nucleic Acids Research, 35, W6-W11.
    

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with a status of 0.
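
Since the exit status is always 0, a calling script cannot use it to detect a failed fetch and has to inspect the output instead. A minimal Python sketch follows, assuming the default flat-file format in which every returned entry has an "ID" line, and assuming entries were requested by entry name rather than by accession number. The function name missing_ids is hypothetical.

```python
import re


def missing_ids(idlist, output_text):
    """Return the requested entry names that have no 'ID' line in the output."""
    requested = [name.strip() for name in idlist.split(",") if name.strip()]
    # The second field of each 'ID' line is the entry name.
    found = set(re.findall(r"^ID\s+(\S+)", output_text, flags=re.MULTILINE))
    return [name for name in requested if name not in found]


if __name__ == "__main__":
    output = "ID   WAP_MOUSE               Reviewed;         134 AA.\n//\n"
    print(missing_ids("WAP_MOUSE,WAP_RAT", output))
```

An empty list means every requested entry name was found in the output.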

Known bugs

None.

See also

Program name Description
kfetchdata Fetch an entry in a defined format and style
seqret Reads and writes (returns) sequences

Author(s)

Kazuki Oshita (cory @ g-language.org)
Institute for Advanced Biosciences, Keio University
252-8520 Japan

History

2009 - Written by Kazuki Oshita
Mar 2013 - Document updated by Kazuki Oshita

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments

None.