65 research outputs found

    Transcriptional network involving ERG and AR orchestrates Distal-less homeobox-1 mediated prostate cancer progression

    Distal-less homeobox-1 (DLX1) is a well-established non-invasive biomarker for prostate cancer (PCa) diagnosis; however, its mechanistic underpinnings in disease pathobiology are not known. Here, we reveal the oncogenic role of DLX1 and show that abrogating its function leads to reduced tumorigenesis and metastases. We observed that ~60% of advanced-stage and metastatic patients display higher DLX1 levels. Moreover, ~96% of TMPRSS2-ERG fusion-positive and ~70% of androgen receptor (AR)-positive patients show elevated DLX1, associated with aggressive disease and poor survival. Mechanistically, ERG coordinates with enhancer-bound AR and FOXA1 to drive transcriptional upregulation of DLX1 in an ERG-positive background. However, in an ERG-negative context, AR/AR-V7 and FOXA1 suffice to upregulate DLX1. Notably, inhibiting ERG/AR-mediated DLX1 transcription using a BET inhibitor (BETi) and/or anti-androgen drugs reduces its expression and downstream oncogenic effects. In conclusion, this study establishes DLX1 as a direct target of ERG/AR with an oncogenic role and demonstrates the clinical significance of BETi and anti-androgens for DLX1-positive patients.

    Counter Turing Test CT^2: AI-Generated Text Detection is Not as Easy as You May Think -- Introducing AI Detectability Index

    With the rise of prolific ChatGPT, the risk and consequences of AI-generated text have increased alarmingly. To address the inevitable question of ownership attribution for AI-generated artifacts, the US Copyright Office released a statement stating that 'If a work's traditional elements of authorship were produced by a machine, the work lacks human authorship and the Office will not register it'. Furthermore, both the US and the EU governments have recently drafted their initial proposals regarding the regulatory framework for AI. Given this cynosural spotlight on generative AI, AI-generated text detection (AGTD) has emerged as a topic that has already received immediate attention in research, with some initial methods having been proposed, soon followed by the emergence of techniques to bypass detection. This paper introduces the Counter Turing Test (CT^2), a benchmark consisting of techniques aiming to offer a comprehensive evaluation of the robustness of existing AGTD techniques. Our empirical findings unequivocally highlight the fragility of the proposed AGTD methods under scrutiny. Amidst the extensive deliberations on policy-making for regulating AI development, it is of utmost importance to assess the detectability of content generated by LLMs. Thus, to establish a quantifiable spectrum facilitating the evaluation and ranking of LLMs according to their detectability levels, we propose the AI Detectability Index (ADI). We conduct a thorough examination of 15 contemporary LLMs, empirically demonstrating that larger LLMs tend to have a higher ADI, indicating they are less detectable compared to smaller LLMs. We firmly believe that ADI holds significant value as a tool for the wider NLP community, with the potential to serve as a rubric in AI-related policy-making. Comment: EMNLP 2023 Mai

    Discovery of error-tolerant biclusters from noisy gene expression data

    An important analysis performed on microarray gene-expression data is to discover biclusters, which denote groups of genes that are coherently expressed for a subset of conditions. Various biclustering algorithms have been proposed to find different types of biclusters from these real-valued gene-expression data sets. However, these algorithms suffer from several limitations, such as an inability to explicitly handle errors/noise in the data; difficulty in discovering small biclusters due to their top-down approach; and the inability of some of the approaches to find overlapping biclusters, which is crucial as many genes participate in multiple biological processes. Association pattern mining also produces biclusters as its result and can naturally address some of these limitations. However, traditional association mining only finds exact biclusters, whic

    Induction Chemotherapy Followed by Chemo-intensity-modulated Radiotherapy for Locally Advanced Nasopharyngeal Cancer.

    Aims: To determine the toxicity and tumour control rates after chemo-intensity-modulated radiotherapy (chemo-IMRT) for locally advanced nasopharyngeal cancers (LA-NPC). Materials and methods: Patients with LA-NPC were enrolled in a trial to receive induction chemotherapy followed by parotid-sparing chemo-IMRT. The primary site and involved nodal levels received 65 Gy in 30 fractions and at-risk nodal levels received 54 Gy in 30 fractions. Incidence of ≥grade 2 subjective xerostomia was the primary end point. Secondary end points included incidences of acute and late toxicities and survival outcomes. Results: Forty-two patients with American Joint Committee on Cancer stages II (12%), III (26%) and IV (62%) (World Health Organization subtype: I [5%]; II [40%]; III [55%]) completed treatment between January 2006 and April 2010 with a median follow-up of 32 months. Incidences of ≥grade 2 acute toxicities were: dysphagia 83%; xerostomia 76%; mucositis 97%; pain 76%; fatigue 99% and ototoxicity 12%. At 12 months, ≥grade 2 subjective xerostomia was observed in 31%, ototoxicity in 13% and dysphagia in 4%. Two-year locoregional control was 86.2% (95% confidence interval: 70.0-94.0) with 2-year progression-free survival at 78.4% (61.4-88.6) and 2-year overall survival at 85.9% (69.3-93.9). Conclusions: Chemo-IMRT for LA-NPC is feasible with good survival outcomes. At 1 year, 31% experience ≥grade 2 subjective xerostomia.

    Modeling tissue-specific structural patterns in human and mouse promoters

    Sets of genes expressed in the same tissue are believed to be under the regulation of a similar set of transcription factors, and can thus be assumed to contain similar structural patterns in their regulatory regions. Here we present a study of the structural patterns in promoters of genes expressed specifically in 26 human and 34 mouse tissues. For each tissue we constructed promoter structure models, taking into account the presence of motifs, their positioning relative to the transcription start site, and the pairwise positioning of motifs. We found that 35 out of 60 models (58%) were able to distinguish positive test promoter sequences from control promoter sequences with statistical significance. Models with high performance include those for liver, skeletal muscle, kidney and tongue. Many of the important structural patterns in these models involve transcription factors of known importance in the tissues in question, and structural patterns tend to be conserved between human and mouse. In addition, promoter models for related tissues tend to have high inter-tissue performance, indicating that their promoters share common structural patterns. Together, these results illustrate the validity of our models, but also indicate that the promoter structures for some tissues are easier to model than those of others.

    CpG-depleted promoters harbor tissue-specific transcription factor binding signals—implications for motif overrepresentation analyses

    Motif overrepresentation analysis of proximal promoters is a common approach to characterize the regulatory properties of co-expressed sets of genes. Here we show that these approaches perform well on mammalian CpG-depleted promoter sets that regulate expression in terminally differentiated tissues such as liver and heart. In contrast, CpG-rich promoters show very little overrepresentation signal, even when associated with genes that display highly constrained spatiotemporal expression. For instance, while ∼50% of heart-specific genes possess CpG-rich promoters, we find that the frequently observed enrichment of MEF2-binding sites upstream of heart-specific genes is solely due to contributions from CpG-depleted promoters. Similar results are obtained for all sets of tissue-specific genes, indicating that CpG-rich and CpG-depleted promoters differ fundamentally in their distribution of regulatory inputs around the transcription start site. In order not to dilute the respective transcription factor binding signals, the two promoter types should thus be treated as separate sets in any motif overrepresentation analysis.
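The recommendation to treat the two promoter types as separate sets could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the observed/expected CpG ratio, the `threshold=0.6` cut-off, exact substring matching of the motif, and the hypergeometric test are all assumptions chosen for brevity, and the function names are hypothetical.

```python
from math import comb

def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)

def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n promoters are drawn from N, of which K carry the motif."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def motif_enrichment(foreground, background, motif, cpg_rich, threshold=0.6):
    """Overrepresentation p-value of `motif`, computed within one CpG class only.

    `threshold` is an assumed cut-off on the CpG ratio, and the motif is
    matched as an exact substring (a simplification of real motif scanning).
    """
    def in_class(seq):
        return (cpg_obs_exp(seq) >= threshold) == cpg_rich

    fg = [s for s in foreground if in_class(s)]
    bg = [s for s in background if in_class(s)]
    if not fg:
        return 1.0  # nothing to test in this class
    universe = fg + bg
    k = sum(motif in s for s in fg)        # motif-bearing foreground promoters
    K = sum(motif in s for s in universe)  # motif-bearing promoters overall
    return hypergeom_tail(k, len(fg), K, len(universe))
```

Calling `motif_enrichment` once with `cpg_rich=True` and once with `cpg_rich=False` gives one p-value per promoter class, so a signal confined to CpG-depleted promoters is not diluted by the CpG-rich ones.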

    A Linear Model for Transcription Factor Binding Affinity Prediction in Protein Binding Microarrays

    Protein binding microarrays (PBM) are a high-throughput technology used to characterize protein-DNA binding. The arrays measure a protein's affinity toward thousands of double-stranded DNA sequences at once, producing a comprehensive binding specificity catalog. We present a linear model for predicting the binding affinity of a protein toward DNA sequences based on PBM data. Our model represents the measured intensity of an individual probe as a sum of the binding affinity contributions of the probe's subsequences. These subsequences characterize a DNA binding motif and can be used to predict the intensity of protein binding against arbitrary DNA sequences. Our method was the best performer in the Dialogue for Reverse Engineering Assessments and Methods 5 (DREAM5) transcription factor/DNA motif recognition challenge. For the DREAM5 bonus challenge, we also developed an approach for the identification of transcription factors based on their PBM binding profiles. Our approach for TF identification achieved the best performance in the bonus challenge.
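The idea of modelling a probe's intensity as a sum of contributions from its subsequences can be illustrated with a small sketch. It assumes fixed-length k-mers as the subsequences and fits the contributions by plain stochastic gradient descent on squared error; the abstract specifies neither choice, and the function names and parameters (`k`, `lr`, `epochs`) are hypothetical.

```python
from collections import Counter

def kmers(seq, k):
    """All overlapping length-k subsequences of a probe sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def predict_intensity(seq, weights, k):
    """Predicted intensity: sum of the contributions of the probe's k-mers.

    k-mers absent from `weights` contribute 0.
    """
    return sum(weights.get(km, 0.0) for km in kmers(seq, k))

def fit_weights(probes, intensities, k, lr=0.05, epochs=500):
    """Fit per-k-mer contributions by stochastic gradient descent.

    Assumed fitting procedure for illustration only; any least-squares
    solver over the k-mer count matrix would serve the same purpose.
    """
    weights = {}
    for _ in range(epochs):
        for seq, y in zip(probes, intensities):
            counts = Counter(kmers(seq, k))
            err = predict_intensity(seq, weights, k) - y
            for km, c in counts.items():
                # Gradient of 0.5 * err^2 with respect to this k-mer's weight
                weights[km] = weights.get(km, 0.0) - lr * err * c
    return weights
```

Once fitted, `predict_intensity` can score sequences that were never on the array, provided they are made up of k-mers seen during fitting.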