-
Notifications
You must be signed in to change notification settings - Fork 59
Expand file tree
/
Copy pathmodel-template.yaml
More file actions
70 lines (66 loc) · 2.6 KB
/
model-template.yaml
File metadata and controls
70 lines (66 loc) · 2.6 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
type: keras
args:
arch:
url: {{ args_arch_url }}
md5: {{ args_arch_md5 }}
weights:
url: {{ args_weights_url }}
md5: {{ args_weights_md5 }}
custom_objects: custom_keras_objects.py
backend: tensorflow
image_dim_ordering: tf
info:
authors:
- name: Babak Alipanahi
- name: Andrew Delong
- name: Matthew T Weirauch
- name: Brendan J Frey
contributors:
- name: Johnny Israeli
github: jisraeli
name: DeepBind
version: 0.1
trained_on: "?All chromosomes? Data from protein binding microarrays (Mukherjee et al., 2004), RNAcompete assays (Ray et al., 2009), ChIP-seq (Kharchenko et al., 2008), and HT-SELEX (Jolma et al., 2010)"
doc: >
Abstract:
Knowing the sequence specificities of DNA- and RNA-binding proteins is essential for developing models of the regulatory processes in biological systems and for identifying causal disease variants. Here we show that sequence specificities can be ascertained from experimental data with 'deep learning' techniques, which offer a scalable, flexible and unified computational approach for pattern discovery. Using a diverse array of experimental data and evaluation metrics, we find that deep learning outperforms other state-of-the-art methods, even when training on in vitro data and testing on in vivo data. We call this approach DeepBind and have built a stand-alone software tool that is fully automatic and handles millions of sequences per experiment. Specificities determined by DeepBind are readily visualized as a weighted ensemble of position weight matrices or as a 'mutation map' that indicates how variations affect binding within a specific sequence.
cite_as: https://doi.org/10.1038/nbt.3300
license: BSD 3-Clause
tags:
- DNA binding
default_dataloader:
defined_as: kipoiseq.dataloaders.SeqIntervalDl
default_args:
auto_resize_len: 101
dependencies:
conda:
- python=3.6.12
- h5py=2.10.0
- tensorflow=1.4.1
- keras=2.1.6
- python=3.6
- pysam=0.15.3
- pip=20.2.4
schema:
inputs:
name: seq
special_type: DNASeq
shape: (101, 4)
doc: DNA sequence
associated_metadata: ranges
targets:
name: binding_prob
shape: (1, )
doc: Protein binding probability
postprocessing:
variant_effects:
seq_input:
- seq
scoring_functions:
- type: diff
{% if model == 'Homo_sapiens/TF/D00817.001_ChIP-seq_TBP' %}
test:
expect:
url: https://s3.eu-central-1.amazonaws.com/kipoi-models/predictions/14f9bf4b49e21c7b31e8f6d6b9fc69ed88e25f43/DeepBind/{{ model }}/predictions.h5
md5: 49efa1bb7794001d744b8ff1eb2ebd5c
{% endif %}