[ROSALIND] Removing intron regions from DNA and translating into protein

by HelloRabbit 2023. 5. 27.
Problem description

Transcription is the process of copying part of a DNA molecule into RNA. Inside the cell nucleus, an enzyme called RNA polymerase (RNAP) uses one of the two DNA strands as a template (the template strand) and builds a complementary sequence. During this step U, rather than T, is paired with A, and the resulting sequence is called precursor mRNA (pre-mRNA).

[Figure: the transcription process]

Pre-mRNA consists of introns and exons. Before it is translated into protein, the intron regions are removed and the exons are joined together; the resulting sequence is called mRNA. Intron removal is carried out by the spliceosome, a complex made of various RNAs and proteins, and the process is called splicing. The RNAs and proteins in the spliceosome would presumably have had to go through splicing themselves, so how the components of the very first spliceosome were spliced is still an open question.

[Figure: the coding strand of the DNA has the same sequence as the RNA (only T is replaced by U)]

To sort out the terminology: DNA consists of two strands, called the template strand (= non-coding strand) and the coding strand (= non-template strand). These roles are not fixed in advance. The template strand is whichever strand RNAP uses as a template when it builds the RNA, so the opposite DNA strand naturally becomes the coding strand. In other words, apart from T being replaced by U, the coding strand of the DNA has the same sequence as the RNA that is produced (as in the figure above)! And because exons are the parts that actually end up as functional protein, the exon portions of a gene (strictly speaking, the parts of the exons that get translated) are called the gene's coding region.
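As a quick illustration of the two strands, here is a minimal Python sketch. The function names `transcribe_coding` and `transcribe_template` and the 9-base example are made up for this illustration, and both strands are assumed to be written in the 5' to 3' direction.

```python
# Hypothetical helpers for illustration only (not part of the Rosalind task).

# Maps each DNA base to the RNA base that pairs with it.
COMPLEMENT_RNA = str.maketrans("ACGT", "UGCA")


def transcribe_coding(coding_strand):
    """Coding strand -> RNA: the sequence is the same, with T swapped for U."""
    return coding_strand.upper().replace("T", "U")


def transcribe_template(template_strand):
    """Template strand (written 5'->3') -> RNA: complement each base, then reverse."""
    return template_strand.upper().translate(COMPLEMENT_RNA)[::-1]


coding = "ATGGTCTAC"
template = "GTAGACCAT"  # reverse complement of `coding`

print(transcribe_coding(coding))      # AUGGUCUAC
print(transcribe_template(template))  # AUGGUCUAC
```

Both calls give the same RNA, which is why the problem below only needs the T to U replacement once the coding strand is given.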

Problem (RNA Splicing)

Given a DNA string and a set of its intron regions, join only the exons and translate the result into a protein string. You may treat the given DNA string as the coding strand.

Example

>Rosalind_10
ATGGTCTACATAGCTGACAAACAGCACGTAGCAATCGGTCGAATCTCGAGAGGCATATGGTCACATGATCGGTCGAGCGTGTTTCAAAGTTTGCGCCTAG
>Rosalind_12
ATCGGTCGAA
>Rosalind_15
ATCGGTCGAGCGTGT

Expected output

MVYIADKQHVASREAYGHMFKVCA
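To see where this output comes from, the sketch below locates each intron in the sample and lists the exon pieces that remain. The helper `exon_pieces` is a made-up name, and it assumes that every intron occurs exactly once and that introns do not overlap, which holds for this sample but is not guaranteed in general.

```python
# Hypothetical helper for illustration: assumes each intron occurs exactly once
# in the DNA string and that no two introns overlap.
def exon_pieces(dna, introns):
    # (start, end) of each intron, 0-based, end exclusive, in left-to-right order
    spans = sorted((dna.find(i), dna.find(i) + len(i)) for i in introns)
    pieces, prev_end = [], 0
    for start, end in spans:
        pieces.append(dna[prev_end:start])  # exon to the left of this intron
        prev_end = end
    pieces.append(dna[prev_end:])           # exon after the last intron
    return spans, pieces


dna = ("ATGGTCTACATAGCTGACAAACAGCACGTAGCAATCGGTCGAATCTCGAGAGGCATATGG"
       "TCACATGATCGGTCGAGCGTGTTTCAAAGTTTGCGCCTAG")
introns = ["ATCGGTCGAA", "ATCGGTCGAGCGTGT"]

spans, pieces = exon_pieces(dna, introns)
print(spans)                     # [(33, 43), (67, 82)]
print([len(p) for p in pieces])  # [33, 24, 18]
```

The three exon pieces add up to 75 bases, that is 25 codons: 24 amino acids plus the final stop codon, matching the 24-letter protein above.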

 

Solution

codon = {}
# Store the amino acid for each codon in a dictionary.
# The loop expects each line of aa_codon.txt to hold alternating tokens: codon, amino acid, codon, amino acid, ...
with open("aa_codon.txt", "r") as f:
    for line in f:
        aa = line.split()
        for i in range(0, len(aa), 2):
            codon[aa[i]] = aa[i+1]

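# Translate an RNA sequence into a protein sequence.
def translation(rna):
    protein = ''
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(rna) - 2, 3):
        aa = codon[rna[i:i+3]]
        if aa == 'Stop':    # translation ends at the first stop codon
            break
        protein += aa

    return protein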

# Read the test case.
with open("rosalind_splc.txt") as f:
    seqs = []
    seq = ''
    for line in f:
        if line.startswith('>'):
            if seq: # store the sequence that was just completed
                seqs.append(seq)
            seq = ''
        else:
            seq += line.strip()
    if seq:         # store the last sequence
        seqs.append(seq)

    dna = seqs[0]   # the first sequence is the DNA string
    for i in range(1, len(seqs)):   # the remaining sequences are the introns
        dna = dna.replace(seqs[i], '')   # cut out the intron, joining the exons on either side

    rna = dna.upper().replace('T', 'U')     # replace T with U: DNA -> RNA
    print(translation(rna)) # translate the RNA sequence into protein
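The solution above depends on an external `aa_codon.txt` file. For readers who do not have that file, here is a self-contained sketch of the same idea that builds the standard genetic code in memory instead. The names `CODON_TABLE`, `parse_fasta`, and `splice_and_translate` are my own for this illustration, stop codons are marked with `*` rather than `Stop`, and the input is passed as a string rather than read from `rosalind_splc.txt`.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
# '*' marks a stop codon (an assumption of this sketch; the file-based
# version above uses the word 'Stop' instead).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def parse_fasta(text):
    """Return the sequences of a FASTA-formatted string, in file order."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            records.append("")      # a header line starts a new record
        elif line and records:
            records[-1] += line     # sequence lines may be wrapped
    return records


def splice_and_translate(fasta_text):
    """Remove the introns from the first record and translate the exons."""
    dna, *introns = parse_fasta(fasta_text)
    for intron in introns:
        dna = dna.replace(intron, "")        # joins the flanking exons
    rna = dna.upper().replace("T", "U")      # coding strand -> mRNA
    protein = []
    for i in range(0, len(rna) - 2, 3):      # complete codons only
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":                        # stop at the first stop codon
            break
        protein.append(aa)
    return "".join(protein)


SAMPLE = """>Rosalind_10
ATGGTCTACATAGCTGACAAACAGCACGTAGCAATCGGTCGAATCTCGAGAGGCATATGGTCACATGATCGGTCGAGCGTGTTTCAAAGTTTGCGCCTAG
>Rosalind_12
ATCGGTCGAA
>Rosalind_15
ATCGGTCGAGCGTGT
"""

print(splice_and_translate(SAMPLE))  # MVYIADKQHVASREAYGHMFKVCA
```

Like the original, this removes introns one after another by plain string matching, so it relies on each intron appearing in the DNA as given and on the removals not interfering with each other.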

 

 

 
