MA3
From CLAB
"The Microarray Project"
Overview
- Lead developer: Sajan Singh Suwal
- MA3 development instance (also "production" for now).
- Development is occurring in miscellaneous *2.php files and these tables: matrixinfo_test2, organisminfo_test2, probeinfo_test2
- Currently open tickets in RT.
- Source code: MA3 has its own SVN repository.
- MA3 Manual
PostgreSQL database
Tables
- matrixinfo - 9.8M rows
- One row is the brightness stats for a single X Y in a single experiment
- record_id should be experiment_id
- each row is small. That's good.
- outliner should be outlier
- organisminfo - 27 rows
- Actually 1 row per experiment, so I would have called this the experiment table.
- record_id should be experiment_id
- probeinfo - 123K rows
- target sequences badly need to be normalized out of here
- probe_id should be probeset_id
- record_id should be organism_id
psql -U schandio -d microarraydata
\l              # List all databases
\d              # List all objects in your current database
\d matrixinfo   # Show the schema of the matrixinfo table

# Dump a schema. Takes 6 minutes! I have no idea why for 3 tables.
pg_dump -s -U schandio -W microarraydata
Maintenance
Routine Data Maintenance Tasks contains many recommendations. I'm running vacuum verbose analyze; now. I have no idea if anyone has ever done this. --Jhannah 11:13, 22 July 2007 (CDT)
- That seems to have helped big time, both with queries using index scans and with query response times. --Jhannah 11:59, 2 August 2007 (CDT)
Query Tuning
explain [analyze] select * from matrixinfo;
Software
- .fla files are Macromedia Flash source code.
- .swf are the compiled Flash objects.
- experimentList.php is the normal post
- experimentListA.php is used when you select all probes. It runs in the background and sends its results to a file instead of the screen.
Misc SVN commands
svn checkout svn://klab.ist.unomaha.edu/KLAB/projects/transcriptomics/MArray MArray
svn stat
svn info
svn add
svn diff
svn commit
Progress on MA3 Project
RT103 The probes in both the Flash display and the chart display are now ordered by probe start position, and all the associated data line up with each other on the display.
RT113 Sept 10 - The server is being moved from kiran.homelinux.net to the biobase.ist.unomaha.edu server, which has much higher bandwidth available.
New location: http://biobase.ist.unomaha.edu/KLAB/projects/transcriptomics/MArray/
Tasks:
- Back up the current microarray database from kiran.homelinux.net
- Download the file to a local machine
- Upload it to biobase
- Install the psql package for PHP on biobase
- Set up the database and load the dump file into it
- Set up users and adjust permissions on the database
- Copy the source files to biobase
- Added the source files to the new SVN server
RT15 Purpose: enhancement to the MA3 tool. Capability to upload .cel files generated in the lab and automate the process of adding their data to the MA3 database for multiple organisms.
Update: Developing a model of how the entire process is going to work. (.cel files contain the data from an experiment for a given organism.)
1) The .cel (version 4) file needs to be converted to .cel (version 3) using a Windows program. The version 3 .cel file is plain text (sample) and the version 4 .cel file is binary (sample).
2) The converted file is then parsed by a .pl script and the data is moved to the psql database.
3) Since the cel file converter is a Windows binary application which requires user interaction, we are developing a Windows application in C# which automates the interaction when a new cel file that needs to be converted is detected.
4) The dev upload site is at: http://biobase.ist.unomaha.edu/KLAB/projects/transcriptomics/MArray/uploader/
5) The CelFileConversionEngine for the Windows server is (CelFileConversionEngine.exe)
- Can you please replace 'you will be redirected to status page in 7s' with the following: Please do not exit, you will be automatically redirected to the "Status Page"
6) New updates:
- The Windows system and the Linux system are connecting via FTP.
- The complete system has been successfully tested.
- The complete cel file upload system takes 25-30 mins to complete.
- If a new organism is detected, the organism info update process takes about 10-20 mins.
- Made modifications to the query page to be able to select organisms to query on.
- Created a new table, organisms, which keeps track of the organisms and their record_id values.
7) The new organism handler is complete. It takes 3 files for the organism and parses them into the probeinfo table.
- Sample files: s_aureus_probe_tab (probe table file), s_aureus_target (target sequence file), s_aureus_annot.csv (probeinfo file)
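For anyone picking this up later, here is a rough sketch of steps 1 and 2 above: deciding whether a .cel file still needs conversion, then pulling the per-cell brightness stats out of a version 3 file. It is written in Python purely for illustration; the real parser is the .pl script, which is not reproduced here. It assumes the usual layout of these files as I understand it: a version 3 file is text that starts with [CEL] and has an [INTENSITY] section with X, Y, MEAN, STDV, NPIXELS columns plus an [OUTLIERS] section of X, Y pairs, while a version 4 file is binary and starts with the 32-bit little-endian magic number 64. The function names and the dictionary keys are made up for this example and are not the actual matrixinfo column names.

```python
import struct
from typing import Dict, Iterable, Iterator, List


def cel_version(first_bytes: bytes) -> int:
    """Guess the CEL version from the first few bytes of a file.

    Assumption: version 3 is text starting with "[CEL]"; version 4 is
    binary starting with the little-endian 32-bit magic number 64.
    """
    if first_bytes.lstrip().startswith(b"[CEL]"):
        return 3  # plain text, ready for the parser
    if len(first_bytes) >= 4 and struct.unpack("<i", first_bytes[:4])[0] == 64:
        return 4  # binary, needs the Windows converter first
    raise ValueError("does not look like a version 3 or 4 .cel file")


def section_rows(lines: Iterable[str], name: str) -> Iterator[List[str]]:
    """Yield the whitespace-split data rows of one [SECTION]."""
    wanted = "[" + name + "]"
    inside = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new section header: we are inside only if it is the one we want.
            inside = line == wanted
        elif inside and line and "=" not in line:
            # Skips blank lines and key=value lines such as NumberCells=...
            yield line.split()


def load_cells(text: str) -> List[Dict[str, object]]:
    """Return one dict per X/Y cell (hypothetical keys, not real column names)."""
    lines = text.splitlines()
    outliers = {(int(x), int(y)) for x, y in section_rows(lines, "OUTLIERS")}
    cells = []
    for x, y, mean, stdv, npixels in section_rows(lines, "INTENSITY"):
        xy = (int(x), int(y))
        cells.append({
            "x": xy[0],
            "y": xy[1],
            "mean": float(mean),
            "stdv": float(stdv),
            "npixels": int(npixels),
            "outlier": xy in outliers,
        })
    return cells
```

Each dict corresponds to what the Tables section describes as one matrixinfo row (the brightness stats for a single X Y in a single experiment); the experiment id and the actual database insert are left out because they depend on the schema.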
Flow diagram for the cel file upload/conversion/update System
Entity Relation diagram of the microarray query system
Note: Currently the tables being used are matrixinfo_test2, organisminfo_test2, probeinfo_test2. (for testing and debugging purposes).



