Machine Learning - Using algorithms and computation to generalize from data
Life Sciences - Anything that wiggles from 20 nm to 30 m in length
Most of the subjects I will touch on are incredibly deep and worthy of their own talk. Thankfully, the Research Triangle Analysts have already given some of them.
Fortunately, it requires near willful ignorance to acquire hacking skills and substantive expertise without also learning some math and statistics along the way. As such, the danger zone is sparsely populated, however, it does not take many to produce a lot of damage. - Drew Conway
The emphasis here is on finding a common vocabulary between life scientists and analysts: terms like pipelines, dataframes, and representations.
from rdkit import Chem
from rdkit.Chem import Draw
%matplotlib inline
m3 = Chem.MolFromSmiles('O=C1OC2=C(C=C1)C1=C(C=CCO1)C=C2')
fig3 = Draw.MolToMPL(m3)
smiles = (
    "O=C(NCc1cc(OC)c(O)cc1)CCCC/C=C/C(C)C",
    "CC(C)CCCCCC(=O)NCC1=CC(=C(C=C1)O)OC",
    "c1(C(=O)O)cc(OC)c(O)cc1",
)
mols = [Chem.MolFromSmiles(x) for x in smiles]
Draw.MolsToGridImage(mols)
Dealing with 30X genome-sized datasets initially
Comparing RNA expression levels takes this from a big data problem back to another simple classification problem
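To make the "simple classification problem" framing concrete, here is a minimal sketch that assumes each sample has already been reduced to a short vector of expression levels. The expression values, the labels "tumor" and "normal", and the nearest-centroid method are all illustrative assumptions; the slides do not say which data or classifier was used.

```python
# Hypothetical toy data: each sample is a vector of expression levels
# for three made-up genes, paired with an assumed condition label.
training = [
    ([9.1, 1.2, 3.0], "tumor"),
    ([8.7, 0.9, 3.4], "tumor"),
    ([2.1, 6.8, 3.1], "normal"),
    ([1.8, 7.2, 2.9], "normal"),
]


def fit_centroids(samples):
    """Average the expression vectors of each class into one centroid."""
    sums, counts = {}, {}
    for vector, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(vector)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vector)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(centroids, vector):
    """Return the label whose centroid lies closest to the vector."""
    return min(
        centroids,
        key=lambda label: squared_distance(centroids[label], vector),
    )


centroids = fit_centroids(training)
print(predict(centroids, [8.5, 1.5, 3.2]))
```

Once the raw reads are summarized as per-gene expression levels, the table is small enough for everyday classification tools. Nearest-centroid is used here only because it fits in a few lines.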
Picture of simple net
Picture of architecture
Example of Python code with Theano
New opportunities come from tying together multiple models.
For the hackers
For the employed
For the enthusiast
!jupyter nbconvert --to slides MLforLS.ipynb --post serve