y a n    k a r k l i n

post-doc
laboratory for computational vision
center for neural science
new york university



publications   /   code    

 

journal articles

Y. Karklin and M. S. Lewicki, Emergence of complex cell properties by learning to generalize in natural scenes, Nature, 2009. [pdf+supp] [abstract] [bibtex] [html]

Y. Karklin and M. S. Lewicki, A hierarchical Bayesian model for learning non-linear statistical regularities in non-stationary natural signals, Neural Computation, 2005. [pdf] [abstract] [bibtex]

Y. Karklin and M. S. Lewicki, Learning higher-order structures in natural images, Network: Computation in Neural Systems, 2003. [pdf] [abstract] [bibtex]

conference papers + abstracts

Y. Karklin, C. Ekanadham, and E. P. Simoncelli, Hierarchical spike coding of sound, Adv in Neural Information Processing Systems (NIPS), 2012. [pdf] [abstract] [bibtex]

Y. Karklin and E. P. Simoncelli, Efficient coding of natural images and movies with populations of noisy nonlinear neurons, Computational and Systems Neuroscience (CoSyNe), 2012. [abstract] [bibtex]

Y. Karklin and E. P. Simoncelli, Efficient coding of natural images with a population of noisy linear-nonlinear neurons, Adv in Neural Information Processing Systems (NIPS), 2011. [pdf] [abstract] [bibtex]

Y. Karklin and E. P. Simoncelli, Optimal information transfer in a noisy nonlinear neuron, Computational and Systems Neuroscience (CoSyNe), 2011. [poster pdf]

Y. Karklin and M. S. Lewicki, Is early vision optimized for extracting higher-order dependencies?, Adv in Neural Information Processing Systems (NIPS), 2006. [pdf] [abstract] [bibtex]

Y. Karklin and M. S. Lewicki, A model for learning variance components of natural images, Adv in Neural Information Processing Systems (NIPS), 2003. [pdf] [abstract] [bibtex]

phd thesis

Y. Karklin. Hierarchical statistical models of computation in the visual cortex. PhD thesis, CMU-CS-07-159. Carnegie Mellon University, 2007. [pdf]

other projects

Y. Karklin, R. F. Meraz, S. R. Holbrook, Classification of non-coding RNA using graph representations of secondary structure, Proceedings of the Pacific Symposium on Biocomputing, 2005. [pdf] [abstract] [bibtex] [code]
Some genes produce transcripts that function directly in regulatory, catalytic, or structural roles in the cell. These non-coding RNAs are prevalent in all living organisms, and methods that aid the understanding of their functional roles are essential. RNA secondary structure, the pattern of base-pairing, contains the critical information for determining the three-dimensional structure and function of the molecule. In this work we examine whether the basic geometric and topological properties of secondary structure are sufficient to distinguish between RNA families in a learning framework. First, we develop a labeled dual graph representation of RNA secondary structure by adding biologically meaningful labels to the dual graphs proposed by Gan et al. [1]. Next, we define a similarity measure directly on the labeled dual graphs using the recently developed marginalized kernels [2]. Using this similarity measure, we were able to train Support Vector Machine classifiers to distinguish RNAs of known families from random RNAs with similar statistics. For 22 of the 25 families tested, the classifier achieved better than 70% accuracy, with much higher accuracy rates for some families. Training a set of classifiers to automatically assign family labels to RNAs using a one-vs-all multi-class scheme also yielded encouraging results. From these initial learning experiments, we suggest that the labeled dual graph representation, together with kernel machine methods, has potential for use in automated analysis and classification of uncharacterized RNA molecules or efficient genome-wide screens for RNA molecules from existing families.
@InProceedings{Karklin-Meraz-Holbrook-PSB05,
  author    = "Karklin, Y. and Meraz, R. F. and Holbrook, S. R.",
  title     = "Classification of non-coding RNA using graph representations of secondary structure",
  booktitle = "Pacific Symposium on Biocomputing 10",
  pages     = "4--15",
  year      = "2005"
}




 
[ all code made available under the GNU General Public License ]

covariance component model

matlab code for maximum likelihood estimation of the model described in
Y. Karklin and M. S. Lewicki, Emergence of complex cell properties by learning to generalize in natural scenes, Nature, 2009. [link]
Matlab code that implements the learning and inference updates on a toy dataset: cov_model_toy.zip.

variance component model

matlab code for maximum likelihood estimation of the model described in
Y. Karklin and M. S. Lewicki, A hierarchical Bayesian model for learning non-linear statistical regularities in non-stationary natural signals, Neural Computation, 17 (2): 397-423, 2005. [pdf]

download VarianceComponents.m.

(this code implements learning on a synthetic dataset and recovers a set of known components)

rna graph classifier

matlab code for classifying non-coding RNAs based on their secondary structure, as described in
Y. Karklin, R. F. Meraz, S. R. Holbrook. Classification of non-coding RNA using graph representations of secondary structure, Proceedings of the Pacific Symposium on Biocomputing 10:4-15, 2005. [pdf]

download the main package, supplementary files (some more functions for plotting, doing multi-class classification, other utils; highly undocumented!), and a short FAQ.

the SVM part of the code was adapted from LS-SVMlab, though ultimately SVM training is done with simple matlab matrix division.
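
as a rough illustration of what "matrix division" means here: in the standard least-squares SVM formulation, training a classifier reduces to solving a single linear system, which matlab does with the backslash operator. the sketch below is not taken from the package; the toy data, the RBF kernel, and the values of gam and sig2 are assumptions made up for this example (the package itself uses a kernel defined on graphs).

```matlab
% toy 2-d data: two well-separated clusters (made up for this example)
X = [0 0; 0 1; 1 0; 1 1; 4 4; 4 5; 5 4; 5 5];
y = [-1; -1; -1; -1; 1; 1; 1; 1];
n = size(X, 1);

gam  = 10;   % regularization constant (assumed value)
sig2 = 2;    % RBF kernel width (assumed value)

% RBF kernel matrix built from pairwise squared distances
sq = sum(X.^2, 2);
D2 = bsxfun(@plus, sq, sq') - 2 * (X * X');
K  = exp(-D2 / (2 * sig2));

% LS-SVM classifier: one linear system in the bias b and multipliers alpha
Omega = (y * y') .* K;
A     = [0, y'; y, Omega + eye(n) / gam];
rhs   = [0; ones(n, 1)];
sol   = A \ rhs;          % matrix division does all of the training
b     = sol(1);
alpha = sol(2:end);

% classify a new point with the trained model
xnew  = [4.5 4.5];
knew  = exp(-sum(bsxfun(@minus, X, xnew).^2, 2) / (2 * sig2));
label = sign(sum(alpha .* y .* knew) + b);
```

swapping in a different kernel only changes how K and knew are computed; the linear solve stays the same.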

to reproduce the results from the paper, you will also need RNA sequences from Rfam and the folding software from ViennaRNA.

matlash

a nascent project to convert matlab figures into flash objects.

for example, you can type in matlab,
>> x = linspace(0,2*pi,50);
>> plot(x,cos(x),'r'); hold on; plot(x,sin(2*x),'b');
>> matlash
and get a flash object to embed in a web page.
download matlash.m.

here's how it works:
  • matlash.m looks at the current figure in matlab and extracts line plot parameters
  • it writes a haXe program
  • haXe compiles to a flash .swf object
  • most functions are not yet implemented. it will work for a basic line plot like the one above, but that's about it
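
the first step in the list above (reading line plot parameters from the current figure) can be sketched roughly as follows. this is not the code from matlash.m, only a minimal illustration using the standard handle-graphics functions findobj and get; the choice of properties and the plain-text output via fprintf are placeholder assumptions, standing in for the point where the real script writes haXe source.

```matlab
% draw the example figure from above
x = linspace(0, 2*pi, 50);
plot(x, cos(x), 'r'); hold on; plot(x, sin(2*x), 'b');

% find all line objects in the current axes
h = findobj(gca, 'Type', 'line');

% pull out the data and style of each line
for i = 1:numel(h)
    xd  = get(h(i), 'XData');
    yd  = get(h(i), 'YData');
    col = get(h(i), 'Color');       % rgb triple with values in [0,1]
    lw  = get(h(i), 'LineWidth');
    % placeholder output: an exporter would emit drawing commands here
    fprintf('line %d: %d points, rgb = [%g %g %g], width = %g\n', ...
            i, numel(xd), col(1), col(2), col(3), lw);
end
```

other plot types (bars, patches, images) would need their own property lists, which is consistent with the note above that only basic line plots are handled so far.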
why do this? once more developed, it will let you
  • browse data and graphs interactively (e.g. project different aspects of the data, add interactive data labels)
  • dynamically generate figures
  • plot data online in a clean, vector graphics format
  • present results in a concise, organized way