
    The allocation of time to crime: A simple diagrammatical exposition

    In his seminal article on the allocation of time to crime, Isaac Ehrlich (1973) derives five interesting theoretical results. He uses a state-preference diagram to derive one result, retreating to mathematics for deriving the remaining four results. This note shows that all five results can easily be derived from an alternative and simpler diagrammatical exposition that involves intersection of curves rather than tangency between curves.

    Auditing ghosts by prosperity signals

    Ghosts are economic agents who evade taxes by failing to file a return. Knowing nothing about them, the tax agency is unable to track them down through audit strategies which are based on reported income. The present paper develops a simple model of the audit decision for a ghost-busting tax agency which bases its audit strategy on signals of prosperous living, such as ownership of high-quality housing. Ghosts have a preference for high-quality housing, but may opt to own houses of a lower quality so as to escape detection. The paper compares the optimal audit rules and net tax collection under signal and blind auditing of the non-declaring population, deriving conditions under which each strategy will dominate the other.

    The generalized Lasso with non-linear observations

    We study the problem of signal estimation from non-linear observations when the signal belongs to a low-dimensional set buried in a high-dimensional space. A rough heuristic often used in practice postulates that non-linear observations may be treated as noisy linear observations, and thus the signal may be estimated using the generalized Lasso. This is appealing because of the abundance of efficient, specialized solvers for this program. Just as noise may be diminished by projecting onto the lower-dimensional space, the error from modeling non-linear observations as linear observations is greatly reduced when the signal structure is used in the reconstruction. We allow general signal structure, only assuming that the signal belongs to some set K in R^n. We consider the single-index model of non-linearity. Our theory allows the non-linearity to be discontinuous, not one-to-one and even unknown. We assume a random Gaussian model for the measurement matrix, but allow the rows to have an unknown covariance matrix. As special cases of our results, we recover near-optimal theory for noisy linear observations, and also give the first theoretical accuracy guarantee for 1-bit compressed sensing with unknown covariance matrix of the measurement vectors. Comment: 21 pages
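
The heuristic described in this abstract (treat non-linear observations as if they were noisy linear ones and run a Lasso) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the 3-sparse test signal, the sign non-linearity, the identity covariance, the penalty `lam`, and the helper names `soft_threshold`, `lasso_ista` and `demo`. The sketch also swaps the constrained program over a set K for the L1-penalized (Lagrangian) form solved by plain proximal gradient descent, so it is not the paper's estimator or analysis. Sign observations carry no scale information, so only the direction of the estimate is compared with the true signal.

```python
import math
import random


def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal map of t * L1 norm)."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def lasso_ista(A, y, lam, iters=200):
    """Minimise (1/2m)*||A x - y||^2 + lam*||x||_1 by proximal gradient (ISTA)."""
    m, n = len(A), len(A[0])
    # Conservative step size: the gradient's Lipschitz constant is at most
    # the squared Frobenius norm of A divided by m.
    step = m / sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(iters):
        resid = [sum(a * xj for a, xj in zip(row, x)) - yi
                 for row, yi in zip(A, y)]
        grad = [sum(A[i][j] * resid[i] for i in range(m)) / m
                for j in range(n)]
        x = soft_threshold([xj - step * g for xj, g in zip(x, grad)],
                           step * lam)
    return x


def demo(m=500, n=20, seed=0):
    """Recover the direction of a sparse signal from 1-bit (sign) observations."""
    rng = random.Random(seed)
    # Hypothetical 3-sparse, unit-norm signal (illustrative choice).
    x_true = [0.0] * n
    for j in (1, 7, 12):
        x_true[j] = 1.0 / math.sqrt(3)
    # Gaussian measurement matrix with identity covariance (assumption).
    A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    # Non-linear observations: only the sign of each linear measurement.
    y = [1.0 if sum(a * xj for a, xj in zip(row, x_true)) >= 0 else -1.0
         for row in A]
    # Pretend y is a noisy linear observation and run the Lasso on it.
    x_hat = lasso_ista(A, y, lam=0.1)
    norm = math.sqrt(sum(v * v for v in x_hat))
    cosine = sum(a * b for a, b in zip(x_hat, x_true)) / norm
    return x_hat, cosine


if __name__ == "__main__":
    _, cosine = demo()
    print(f"cosine similarity with the true direction: {cosine:.3f}")
```

A cosine similarity close to 1 indicates that the direction of the signal is recovered even though the solver never sees the linear measurements themselves.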

    Safe Functional Inference for Uncharacterized Viral Proteins

    The explosive growth in the number of sequenced genomes has created a flood of protein sequences with unknown structure and function. A routine protocol for functional inference on an input query sequence is based on a database search for homologues. Searching a query against a non-redundant database using BLAST (or more advanced methods, e.g. PSI-BLAST) suffers from several drawbacks: (i) a local alignment often dominates the results; (ii) the reported statistical score (i.e. E-value) is often misleading; (iii) incorrect annotations may be falsely propagated. 
Several systematic methods are commonly used to assign functions to sequences on a genomic scale. In Pfam (1) and similar resources, statistical profiles (HMMs) are built from semi-manual multiple alignments of homologous seed sequences. The profiles are then used to scan genomic sequences for additional family members. The drawbacks of this scheme are: (i) only families with a predetermined seed are considered; (ii) the query must have a detectable sequence similarity to seed sequences; (iii) attention to internal relationships among the family members or the relations to other families is lacking; (iv) family membership is often set by predetermined thresholds.
An alternative to profile- or model-based methods for functional inference relies on a hierarchical clustering of the protein space, as implemented in the ProtoNet approach (2). The fundamental principle is the creation of a tree that captures evolutionary relatedness among protein families. The tree construction is fully automatic, and is based only on reported BLAST similarities among clustered sequences. The tree provides protein groupings at a continuous range of evolutionary granularities, from closely related families to distant superfamilies. Clusters in the ProtoNet tree show high correspondence with homologous sequence (e.g. Pfam and InterPro), functional (e.g. E.C. classification) and structural (e.g. SCOP) families (3). A new clustering scheme (4) has provided an extensive update to the ProtoNet process, which is now based on direct clustering of all detectable sequence similarities.
Herein, we use the ProtoNet resource to develop a methodology for consistent and safe functional inference for remote families. We illustrate the success of our approach on clusters of poorly characterized viral proteins. Viral sequences are characterized by a rapid evolutionary rate, which makes viral families even more remote in terms of sequence similarity. Thus, functional inference for viral families is apparently an unsolved task. Despite this inherent difficulty, the new ProtoNet tree scaffold reliably captures weak evolutionary connections for viral families that were previously overlooked. We take advantage of this and propose new functional assignments for viral protein families.
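
The idea of building a tree purely from pairwise similarities, as this abstract describes, can be sketched as a generic bottom-up (agglomerative) clustering. This is only an illustration under stated assumptions: the similarity scores in `example_sim` are invented, the average-linkage merging rule is a common textbook choice, and the function name `average_linkage_tree` is hypothetical. It is not the ProtoNet algorithm, whose merging rules and BLAST-derived scores are not given in the abstract.

```python
from itertools import combinations


def average_linkage_tree(names, sim):
    """Build a merge tree bottom-up from pairwise similarity scores.

    sim maps frozenset({a, b}) to a similarity score; missing pairs count as 0.
    Returns the list of merges as (left_cluster, right_cluster, score) tuples,
    in the order they were made (most similar first).
    """
    clusters = [(name,) for name in names]
    merges = []

    def link(c1, c2):
        # Average similarity over all cross-cluster pairs (average linkage).
        scores = [sim.get(frozenset((a, b)), 0.0) for a in c1 for b in c2]
        return sum(scores) / len(scores)

    while len(clusters) > 1:
        # Pick the pair of current clusters with the highest average similarity.
        c1, c2 = max(combinations(clusters, 2), key=lambda pair: link(*pair))
        merges.append((c1, c2, link(c1, c2)))
        merged = tuple(sorted(c1 + c2))
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
    return merges


# Invented similarity scores for four hypothetical sequences A to D.
example_sim = {
    frozenset(("A", "B")): 0.9,
    frozenset(("C", "D")): 0.8,
    frozenset(("A", "C")): 0.2,
    frozenset(("A", "D")): 0.1,
    frozenset(("B", "C")): 0.3,
    frozenset(("B", "D")): 0.2,
}

if __name__ == "__main__":
    for left, right, score in average_linkage_tree(["A", "B", "C", "D"], example_sim):
        print(left, right, round(score, 3))
```

Early merges join closely related sequences, while later merges at lower scores correspond to the more distant groupings near the root of the tree.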