Gene locus definition. Determining the relationship between two gene loci. Detecting linkage.

Pig breeding in Europe and America is beginning to use genomic selection. Its technologies make it possible to read a pig's genotype at birth and select the best animals for breeding. This new technology is designed to further increase selection accuracy and the reliability of estimated breeding values in pigs.

The forerunner of genomic selection is marker-assisted selection.

Marker-assisted selection uses markers to tag the genes underlying a quantitative trait, which makes it possible to establish the presence or absence of particular genes (or alleles of genes) in the genome.

A gene is a stretch of DNA, a specific sequence of nucleotides, that encodes the information for synthesizing one protein (or RNA) molecule and thereby underlies the formation of a trait and its transmission by inheritance.

Genes represented in the population by several forms - alleles - are polymorphic genes. Alleles of genes are divided into dominant and recessive. Genetic polymorphism provides a variety of traits within a species.

However, only a few traits are under the control of individual genes (for example, coat color). Productivity indicators are, as a rule, quantitative traits whose development and expression depend on many genes. Some of these genes may have more pronounced effects; such genes are called major genes of quantitative trait loci (QTL). Quantitative trait loci are DNA regions that contain genes, or are linked to genes, underlying a quantitative trait.

The idea of using markers in breeding was first substantiated theoretically by A.S. Serebrovsky back in the 1920s. According to Serebrovsky, a marker (then called a "signal"; the English term "marker" came into use later) is an allele of a gene with a pronounced phenotypic effect that is located next to another allele which determines an economically important trait under study but has no clear phenotypic effect of its own. Thus, by selecting on the phenotypic expression of the signal allele, one also selects the linked alleles that determine the trait under study.

Initially, morphological (phenotypic) traits were used as genetic markers. However, quantitative traits very often have a complex inheritance pattern, their expression depends on environmental conditions, and the number of phenotypic traits that can serve as markers is limited. Later, gene products (proteins) were used as markers. But it is most effective to test genetic polymorphism not at the level of gene products but directly at the level of genes, that is, to use polymorphic DNA nucleotide sequences as markers.

Usually, DNA fragments that lie close to each other on the chromosome are inherited together. This property allows a marker to be used to determine the exact inheritance pattern of a gene that has not yet been precisely localized.

Thus, markers are polymorphic regions of DNA with a known position on the chromosome, but unknown functions, which can be used to identify other genes. Genetic markers must be easily identifiable, linked to a specific locus, and highly polymorphic because homozygotes provide no information.
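The co-inheritance of a marker and a nearby gene can be put in numbers. The following is a minimal sketch, not a method from the article: it assumes Haldane's mapping function (no crossover interference) to convert a map distance into a recombination fraction, and the function names and distances are hypothetical.

```python
import math


def recombination_fraction(distance_cm: float) -> float:
    """Haldane's mapping function: map distance (centimorgans) -> recombination fraction."""
    if distance_cm < 0:
        raise ValueError("map distance must be non-negative")
    distance_morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))


def co_inheritance_probability(distance_cm: float) -> float:
    """Probability that a marker allele and a linked gene allele on the same
    chromosome are transmitted together (no recombination between them)."""
    return 1.0 - recombination_fraction(distance_cm)


if __name__ == "__main__":
    for d in (1, 10, 50):  # illustrative distances, not taken from the article
        print(f"{d:>3} cM: transmitted together with probability "
              f"{co_inheritance_probability(d):.3f}")
```

The closer the marker lies to the gene, the closer this probability is to 1; for very distant loci it approaches 0.5, at which point the marker carries no information about the gene.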

The widespread use of DNA polymorphism variants as genetic markers began in 1980. Molecular genetic markers have been used in programs to preserve the gene pools of farm animal breeds and to accelerate selection for individual characteristics, such as resistance to certain factors and productivity indicators. In Europe, genetic markers have been used in pig breeding since the early 1990s to free populations of the halothane gene, which causes stress syndrome in pigs.

There are several types of molecular genetic markers. Until recently, microsatellites were very popular, since they are widespread in the genome and highly polymorphic. Microsatellites, also known as SSRs (Simple Sequence Repeats) or STRs (Short Tandem Repeats), consist of DNA motifs 2-6 base pairs long repeated in tandem many times. For example, the American company Applied Biosystems has developed a genotyping test system for 11 microsatellites (TGLA227, BM2113, TGLA53, ETH10, SPS115, TGLA126, TGLA122, INRA23, ETH3, ETH225, BM1824). However, microsatellites are not sufficient for fine mapping of individual genome regions, and the high cost of equipment and reagents, together with the development of automated methods using SNP chips, is pushing them out of practice.

A very convenient type of genetic marker is the SNP (single nucleotide polymorphism, pronounced "snip"): a difference of one nucleotide in the DNA sequence between the genomes of members of the same species, or between homologous regions of an individual's homologous chromosomes. SNPs are point mutations that can arise spontaneously or through the action of mutagens. A difference of even one base pair can be the cause of a change in a trait. SNPs are widely distributed in the genome (in humans, about 1 SNP per 1000 base pairs). The pig genome has millions of point mutations. No other type of genomic difference can provide such a density of markers. In addition, unlike microsatellites, SNPs have a low mutation rate per generation (~10^-8), which makes them convenient markers for population genetic analysis. The main advantage of SNPs is that they can be detected by automated methods, for example using DNA microarrays (chips).

To increase the number of SNP markers, a number of foreign companies have recently been pooling their efforts and creating a unified database. By testing for polymorphism a large number of animals with productivity records, they aim to reveal associations between known point mutations and productivity.

To date, a large number of polymorphic gene variants and their joint effects on the productive traits of pigs have been identified. Several genetic tests based on productivity markers are publicly available and used in breeding programs. Using these markers, some performance indicators can be improved.

Examples of productivity markers:

  • fertility markers: ESR - estrogen receptor gene, EPOR - erythropoietin receptor gene;
  • disease resistance markers - ECR F18 receptor gene;
  • markers of growth efficiency, meat productivity - MC4R, HMGA1, CCKAR, POU1F1.

MC4R, the melanocortin 4 receptor gene, is located in pigs on chromosome 1 (SSC1) at q22-q27. Replacement of one nucleotide, A by G, changes the amino acid composition of the MC4 receptor. As a result, the regulation of secretion by adipose tissue cells is disrupted, which disturbs lipid metabolism and directly affects the traits that characterize the fattening and meat qualities of pigs. Allele A is associated with fast growth and greater fat thickness, while allele G is associated with growth efficiency and a larger percentage of lean meat. Pigs homozygous for allele A (AA) reach market weight three days faster than pigs homozygous for allele G (GG), but GG pigs have 8% less fat and better feed conversion.

Meat and fattening productivity is also affected by other genes that control a complex of coupled physiological processes. The POU1F1 gene encodes a pituitary transcription factor that regulates the expression of growth hormone and prolactin. In pigs, the POU1F1 locus maps to chromosome 13. Its polymorphism is due to a point mutation that gives rise to two alleles, C and D. The presence of allele C in a pig's genotype is associated with increased average daily weight gain and earlier maturity.

Markers also make it possible to test the genotype of boars for sex-limited traits that are expressed only in sows. One example is fertility (the number of piglets per farrowing), which the boar passes on to its offspring. For example, testing a boar's genotype for estrogen receptor (ESR) markers makes it possible to select for breeding those boars that will pass better reproductive qualities on to their daughters.

Using the results of marker selection, it is possible to estimate the frequency of occurrence of desirable and undesirable alleles for a breed or line, to carry out further selection so that all animals in the breed have only preferred alleles of genes.
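Estimating allele frequencies for a breed or line from genotyping results is simple counting. The sketch below assumes a marker with two alleles, labeled A and G as in the MC4R example; the function name and the genotype counts are hypothetical.

```python
from typing import Tuple


def allele_frequencies(n_aa: int, n_ag: int, n_gg: int) -> Tuple[float, float]:
    """Return (frequency of A, frequency of G) from genotype counts.

    Each animal carries two copies of the gene, so AA animals contribute
    two A alleles and AG animals contribute one A and one G.
    """
    total_alleles = 2 * (n_aa + n_ag + n_gg)
    if total_alleles == 0:
        raise ValueError("at least one genotyped animal is required")
    freq_a = (2 * n_aa + n_ag) / total_alleles
    return freq_a, 1.0 - freq_a


if __name__ == "__main__":
    # Hypothetical herd: 30 AA, 50 AG and 20 GG animals.
    freq_a, freq_g = allele_frequencies(30, 50, 20)
    print(f"A: {freq_a:.2f}, G: {freq_g:.2f}")
```

Tracking these frequencies from generation to generation shows how quickly selection is moving a line towards the preferred allele.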

Fig. 1. Principle of operation of an oligonucleotide biochip

A DNA chip is a substrate divided into cells, each with a reagent applied to it. The test material is labeled (usually with fluorescent dyes) and applied to the biochip. As shown in the figure, the reagent, an oligonucleotide, binds only the complementary fragment among the fluorescently labeled DNA fragments of the test material. As a result, that element of the biochip fluoresces.

In 2009, the pig genome was sequenced. An SNP chip (a type of DNA microarray) containing 60,000 genetic markers across the genome was developed. To speed up research, special robots were even built for reading SNPs. A pig DNA sample can be tested for the presence or absence of virtually all important point mutations that determine productive traits. Thus, the best animals can be selected on the basis of genetic markers without measuring phenotypic parameters.

These advances have led to the introduction of a new technology, genomic selection. Genomic selection tests the genome at once for a large number of markers covering the entire genome, so that each quantitative trait locus (QTL) is in linkage disequilibrium with at least one marker. The genome is scanned using chips (microarrays) with 50-60 thousand SNPs (which tag the major genes for quantitative traits) to identify single-nucleotide polymorphisms along the animal's genome, determine genotypes with the desired combination of productive traits, and assess the animal's breeding value.
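How a breeding value can be assembled from chip data is sketched below. This is an illustration under stated assumptions, not the procedure of any particular company: it assumes purely additive marker effects that have already been estimated from a reference population with both genotypes and phenotypes, and it codes each SNP genotype as the number of copies (0, 1 or 2) of one chosen allele. The function name and all numbers are hypothetical.

```python
from typing import Sequence


def genomic_breeding_value(genotypes: Sequence[int],
                           marker_effects: Sequence[float]) -> float:
    """Sum of marker effects weighted by allele counts (0, 1 or 2) at each SNP."""
    if len(genotypes) != len(marker_effects):
        raise ValueError("one effect is required per genotyped marker")
    if any(g not in (0, 1, 2) for g in genotypes):
        raise ValueError("genotypes must be coded as 0, 1 or 2 allele copies")
    return sum(g * e for g, e in zip(genotypes, marker_effects))


if __name__ == "__main__":
    # Hypothetical effects for three SNPs; a real chip has tens of thousands.
    effects = [0.5, -0.2, 0.1]
    piglet = [2, 1, 0]
    print(genomic_breeding_value(piglet, effects))
```

Because the genotype is available at birth, such a score can be computed long before the animal or its offspring express the trait.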

The term "genomic selection" was coined by Haley and Visscher in 1998. In 2001, Meuwissen et al. developed and presented a methodology for the analytical estimation of breeding value using a marker map covering the entire genome.

The practical application of genomic selection began in 2009.

Since 2009, the largest companies in the USA (Cooperative Resources International), the Netherlands, Germany, Australia have begun to introduce genomic selection into cattle breeding programs. Bulls of different breeds were genotyped for over 50,000 SNPs.

Hypor was the first to announce a market-wide genomic selection program intended to increase the accuracy of selection in pig breeding. In June 2012 it was announced in the media that Hypor could offer its customers breeding stock selected using genomic breeding values.

The genetics company Hypor has been using genomic selection since 2010, working in close collaboration with the Center for Research and New Technologies of the Hendrix Genetics group. Hendrix Genetics tests over 60,000 SNP markers and uses this information for DNA research. The genomic index of a pig's genetic potential is calculated after analyzing 60,000 genetic markers (SNPs) for the animal. In theory, if there are enough genetic markers to cover all of the pig's DNA (its genome), it is possible to describe all genetic variation for all measurable traits. Modern mathematical and genetic software for data processing is being developed.

The genetics company Hendrix Genetics has a large biobank: it stores blood and tissue samples of breeding animals from several farms and generations for DNA research (revealing the genetic value of animals) and genotype analysis. Hypor has been performing pig DNA testing at its breeding facilities for over two years. All samples from breeding farms in different countries are sent for processing to the new central Hendrix Genetics Genomic Laboratory in Ploufragan (France). Gerard Albers, Director of the Center for Research and Emerging Technologies, stresses: "The Genome Lab is a valuable asset used by all the genetic companies that make up Hendrix Genetics, and truly unique in the pig industry."

Genomic selection is a powerful tool for the future. At present, the effectiveness of genomic selection is limited by the different nature of the interaction between the loci of quantitative traits, the variability of quantitative traits in different breeds, and the influence of environmental factors on the manifestation of the trait. But the results of studies in many countries have confirmed that the use of statistical methods in conjunction with genomic scanning increases the reliability of the forecast of breeding value.

Selection of pigs by statistical methods is inefficient for some indicators (for example, disease resistance, meat quality, fertility). This is due to the following factors:

  • low heritability of the trait,
  • strong influence of environmental factors on the trait,
  • sex-limited expression,
  • expression of the trait only under the influence of certain factors,
  • relatively late onset of the trait,
  • difficulty of measuring the trait (for example, health characteristics),
  • the presence of hidden carriers of the trait.

For example, such a defect of pigs as stress sensitivity is difficult to diagnose and manifests itself in increased mortality of piglets under the influence of stress (transportation, etc.) and deterioration in the quality of meat. DNA testing using gene markers makes it possible to identify all carriers of this defect, including hidden ones, and with this in mind, carry out selection.

For productivity indicators that are difficult to predict by statistical methods, a more reliable assessment requires progeny testing, that is, waiting for the offspring and analyzing their breeding value. DNA markers, by contrast, make it possible to analyze the genotype immediately at birth, without waiting for the trait to be expressed or for offspring to appear, which significantly speeds up selection.

Index assessment of animals is based on conformation and productive qualities (early maturity of piglets, etc.). In both cases phenotypic indicators are used, so to use these traits in calculations one needs to know their heritability coefficients. Even then, however, we are dealing only with the probability that a trait has a genetic basis and with averaged indicators of the animal's ancestors and descendants (there is no way to determine which genes the young animal inherited: the best or the worst of this average). Genotype analysis makes it possible to establish precisely, already at birth, which genes have been inherited, and to evaluate genotypes directly rather than through their phenotypic expression.

However, if pigs are selected for indicators of high heritability, such as easily quantifiable teat numbers, genomic selection will not bring significant benefits.
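Why heritability matters here can be illustrated with a standard quantitative-genetics result: when an animal is selected on a single record of its own phenotype, the accuracy of selection (the correlation between phenotype and true breeding value) equals the square root of heritability. The sketch below is illustrative only; the function name and the heritability values are hypothetical and are not taken from the article.

```python
import math


def own_record_accuracy(heritability: float) -> float:
    """Accuracy of selection on one own phenotypic record: sqrt(h^2)."""
    if not 0.0 <= heritability <= 1.0:
        raise ValueError("heritability must lie between 0 and 1")
    return math.sqrt(heritability)


if __name__ == "__main__":
    for h2 in (0.05, 0.25, 0.64):  # hypothetical low, medium and high heritability
        print(f"h2 = {h2:.2f}: accuracy of phenotypic selection = "
              f"{own_record_accuracy(h2):.2f}")
```

For a highly heritable, easily measured trait the phenotype is already an accurate guide to breeding value, so marker data have little room to add accuracy; for low-heritability traits the room for improvement is much larger.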

Marker-assisted selection does not replace traditional approaches to determining breeding value. Statistical analysis and genomic selection technologies complement each other. Genetic markers speed up the process of animal selection, while index methods make it possible to assess the effectiveness of that selection more accurately.

Genomic selection is an opportunity to turn pig production into precision production. Genomic selection technologies will make it possible to produce a variety of meat products that meet consumers' needs.


LECTURE 1. Classical and molecular genetics. Basic concepts: trait, phenotype, genotype, gene, locus, allele, homozygote, heterozygote, hemizygote.

ICG SB RAS and FEN NSU, Novosibirsk, 2012

1.1. Classical and molecular genetics

Today's lecture is introductory; we will move on to specifics later. As with almost any science, the boundaries of genetics are rather difficult to delineate, and the very general definition "genetics is the science of heredity" is not particularly fruitful. Zhimulev, for example, once said that genetics is now present everywhere, in medicine, forensics, the theory of evolution, and archeology, while in genetics itself even nucleic acids are almost out of sight and everything is protein interactions. In effect, he put an equals sign between genetics and all of modern biology. On the other hand, for approximately the first two-thirds of the twentieth century, genetics was perhaps the most self-contained and well-defined area of biology, standing out primarily for its synthetic methodology, in contrast to the analytical methodology of other branches of biology. To learn about the structure of its object, it did not divide the object into parts but judged the parts indirectly, by observing the whole (namely, by observing the behavior of characters in crosses) and relying on mathematics, and it confirmed the correctness of its conclusions by producing living organisms with predicted properties. Thus, from the very beginning genetics was able to create something new and not only describe what was observed. Meanwhile, in the second half of the twentieth century, molecular biology was developing rapidly, a science that was at first purely analytical, taking things apart. Its progress, however, was achieved largely by genetic methods: recall, for instance, that the genetic code was established in the experiments of Benzer and Crick using mutations in bacteriophages. In that case, though, the genetics of microorganisms was used, whereas the progress of "classical" genetics has always been associated with the genetics of eukaryotes.


As a result, molecular biology gained almost comprehensive knowledge of what a living organism is made of and how. The subjects of molecular biology and genetics overlapped in many ways: both studied the transmission and realization of hereditary information (and a living organism is the realization of hereditary information). However, they approached this subject from opposite sides: genetics "from the outside", molecular biology "from the inside".

In the last third of the twentieth century, molecular biology and genetics met, so to speak, including in the study of eukaryotes. The speculative objects of genetics turned into quite specific physicochemical objects of known structure, and molecular biology became a synthetic science, capable of altering at will even the higher multicellular organisms; genetic modification is one example. Here the boundaries of genetics as a science were blurred beyond recognition: it became impossible to say where molecular biology ends and genetics begins. Moreover, the term "molecular genetics" appeared to designate the resulting synthetic science, which made it difficult to understand what exactly remained in genetics outside it. The genetics of the premolecular period, with all its approaches based on crosses and probability theory, was awarded the honorary title of "classical genetics". With this title, however, it was also, as it were, sent into honorable retirement. One may recall how Watson and Crick declined to discuss the implications of their model of DNA structure in their Nature article, because the consequences were too far-reaching and obvious. At some point, it might have seemed that all of genetics follows from this model.

A paradoxical situation has developed. All genetics courses begin with the history of this science. They examine how Mendel worked with peas, what he obtained and how he interpreted it on the basis of what he knew, and then how Morgan and his school worked with fruit flies, what they obtained and how they interpreted it. Neither topic can be omitted: Mendel gives us the example of a person who developed from scratch, and brilliantly applied, a genetic methodology based on mathematics, while the Morgan school, in the first three decades of the twentieth century, developed the chromosomal theory of heredity and, in fact, all of classical genetics. Beyond this point, genetics courses fall into two large classes. Some work through the entire history and internal logic of the development of this science in detail, demonstrating both the power of its methodology and the capacity of the human mind for speculative penetration into the depths of things. Others skip quickly over this historical period, proceed to molecular genetics, and consider what is currently known about the structure and function of genes. In fact, both types of courses place classical genetics in the past and differ only in how detailed the retrospective is. Classical genetics thus appears to have only historical significance. Yet its powerful methodology has not gone anywhere and is necessary for a very wide range of research. If we look at articles that have thoroughly molecular-biological titles and are published in the best journals, we will see that they are all based on extensive material concerning hundreds of individual mutations and their combinations, taking into account the relationship between the nature of the mutations and the phenotypes they cause. This is true both for Drosophila and mice, for which huge genetic collections have been assembled and special laboratory lines created (some about a hundred years ago, others recently), and for humans, for whom a huge amount of medical genetic (in fact, population genetic) data on hereditary diseases has been accumulated. And the richer this arsenal of knowledge and model organisms, the more elegant the work. None of these very serious studies would be possible without simultaneous mastery of the methodologies of classical and molecular genetics. Therefore, it is best to study these "two genetics" in parallel, however difficult that is to organize.


In modern science one can find examples of how disregard for "outdated" classical genetics leads to oddities. For example, a group of European scientists needed to obtain a translocation heterozygote in peas. (I am speaking now on the assumption that you have some idea of what is being discussed. If you do not, it does not matter: we will consider all this later in almost excessive detail; for now, the point is the need for genetic knowledge.) They obtained it by fusing protoplasts of the parental lines. Regeneration from cell culture in peas is extremely difficult, so this is an extremely laborious route. Why did they do it? Apparently, they thought that carriers of the translocation do not cross with ordinary peas! In fact, reproductive problems do arise when parents differing by a translocation are crossed, but only in the next generation, and they amount to the loss of only half of fertility.

But these scientists at least needed a heterozygote. Meanwhile, the general fascination with molecular biology and disregard for classical genetics mean that the very existence of heterozygotes, that is, the fact that in eukaryotes each gene is represented by two copies, which may be different or identical, is often completely forgotten. For example, I received for review an article by German authors in which they directly sequenced a certain non-coding DNA region from 38 dragonflies caught in different regions (Western Europe, Western Siberia, Japan and North America) and found 20 variants of it. It was written as if only one variant had been found in each individual. However, if the variability is really as high as they claim, then the probability that in every individual of their sample both copies of this sequence turn out to be the same is not very different from zero. And this was not even discussed. After the review, they wrote that heterozygosity was suspected in five cases. If there really are only five, then they had in their hands an amazing phenomenon, the transformation of heterozygotes into homozygotes by as yet unknown mechanisms, but they did not even seem to realize it.
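The argument about the dragonfly sample can be checked with a rough calculation. The sketch below is only an illustration under strong simplifying assumptions that are not in the lecture: a single randomly mating population and 20 equally frequent variants. Equal frequencies give the highest possible heterozygosity, and a real sample drawn from several regions would deviate from this model, so the numbers indicate an order of magnitude only. The function names are hypothetical.

```python
import math
from typing import Sequence


def expected_heterozygosity(allele_freqs: Sequence[float]) -> float:
    """Expected share of heterozygotes under random mating: 1 - sum(p_i^2)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def prob_at_most_k_heterozygotes(het: float, n: int, k: int) -> float:
    """Binomial probability of seeing k or fewer heterozygotes among n individuals."""
    return sum(
        math.comb(n, i) * het**i * (1.0 - het) ** (n - i)
        for i in range(k + 1)
    )


if __name__ == "__main__":
    freqs = [1 / 20] * 20  # assumption: 20 equally frequent variants
    het = expected_heterozygosity(freqs)
    print(f"expected heterozygosity: {het:.2f}")
    print(f"P(no heterozygotes among 38): "
          f"{prob_at_most_k_heterozygotes(het, 38, 0):.1e}")
    print(f"P(at most 5 heterozygotes):   "
          f"{prob_at_most_k_heterozygotes(het, 38, 5):.1e}")
```

Under these assumptions both probabilities are vanishingly small, which is why a report of one variant per individual, or of only five suspected heterozygotes, calls for an explanation.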

Reconstructions of phylogeny based on particular DNA sequences are now widespread. Quite often, attempts are made to judge from the divergence time between populations whether they belong to the same biological species or to different ones. (Note that it is the time of divergence that is estimated, since the genes studied, whose variability accumulates at a more or less constant rate, are certainly not the genes whose changes could be associated with speciation.) Yet divergence time generally has little to do with this problem: the acquisition of reproductive isolation by a local population, that is, speciation, occurs under particular conditions and usually does not take long from a paleontological point of view (tens to hundreds of thousands of years), whereas populations can diverge for a long time without speciating. The real question is whether there is reproductive isolation between the populations (at least potentially). To answer it, one should see whether genes are exchanged between them (where this is physically possible). Here it is very important to find out whether heterozygotes for the alleles characteristic of each population are present where the populations meet, and at what frequency. But almost no one does this, and whether populations belong to the same or to different species is judged by the level of differences between them, compared with the differences in cases that are taken to be settled.

In general, while a single organism (for example, as a representative of its species) can be investigated by the methods of molecular genetics, as soon as we deal with many organisms, that is, as soon as a population genetic problem arises (and such problems arise quite often, for example in population biology and breeding), the approaches of classical genetics become indispensable. Classical genetics is indispensable in everything that concerns individual differences and the characteristics of many individuals of the same species. This is precisely its element, and it is here that those present-day scientists who have replaced a classical genetic education with a molecular-biological one often turn out to be helpless.

Based on the foregoing, I see my task as presenting classical genetics not so much in the historical aspect, following the great scientists of the past, as starting from the current state of science, in particular without setting aside the knowledge you have already gained in the courses on molecular biology and cytology. Some patterns discovered as purely empirical at the level of organisms then acquire a completely natural interpretation at the level of molecules and look almost trivial. At the same time, one should have a clear idea of these laws themselves, since they are to be used at the level of organisms. In a sense, such a course in genetics is conceived as something like a "demonstration of magic tricks with their exposure", where the trick itself and what lies behind it are equally hard facts. Such a course is meant to teach a very productive methodology: to descend from a trait to genes and, through understanding the mechanism of their action, to ascend again to the synthesis of new traits.

As you have already understood, the content of genetics is currently vast and heterogeneous, so the time allotted to us is unlikely to suffice even for a brief acquaintance. This forces us to leave the history of genetics offstage, as an independent topic that would deserve a special course of its own.

Unfortunately, none of the available textbooks matches the ideal of studying genetics outlined above, from trait to gene and back, most likely because this science is now developing too quickly. As some compensation, I will try to post my modest lectures on my own site, where they will be available to those to whom I give the address, that is, to you. I would recommend taking Inge-Vechtomov's textbook "Genetics with the Basics of Selection" as a basis. The textbook by Academician Igor Fedorovich Zhimulev, "General and Molecular Genetics", is also well known; its main emphasis is on molecular genetics, and Leonid Vladimirovich recommends it as a basic textbook. I understand that two basic textbooks are not the most convenient situation for passing an exam. On the other hand, this helps in understanding the subject. I can say that I personally am here, and work at the Institute of Cytology and Genetics at all, solely because I attended the genetics course of Vladimir Aleksandrovich Berdnikov. It was the best course I have ever heard, and it did not correspond to any textbook at all, because V.A. Igor Fedorovich also turned his original course of lectures into a textbook.

We will go through the basics of genetics very thoroughly in order to get a good feel for them. We will start from the very beginning, even though the most elementary foundations of genetics are taught at school, so as not to miss, God forbid, something simple but important. On the other hand, I am dealing with university students who have already completed a course in molecular biology and are currently studying probability theory and mathematical statistics, which allows me not to be too distracted by the material of those courses, which is so necessary for studying genetics. For example, I will assume that you know (or will know at the right time) what alternative splicing or the Poisson distribution is.

The standard logic of presenting biology in university courses is to move from the bottom up: from atoms to molecules and macromolecules, then to cell structures, to the life of the cell itself, and then to the multicellular organism. Once we know the principles of the organization of life in full, this order of presentation turns out to be organic and natural. These principles include the mechanism by which nucleic acids function as carriers of information, primarily about various proteins and functional RNAs (which, after the discovery of small RNAs, turned out to be more diverse than previously thought), and not only about their structure but also about when, where and in what quantity particular RNAs or proteins should be synthesized. These processes are in turn controlled by certain proteins (and often RNAs). Genetic control systems unfold according to a cascade principle: genes encode proteins (RNAs) needed to control genes that encode other proteins (RNAs), and so on. Since almost everything in the body is "made" by proteins (plus some RNAs), it turns out that nucleic acids in fact record information about the whole organism; however, this information cannot be read without previously synthesized proteins that operate on DNA (themselves made, again, using a DNA template).

This order of presentation completely coincides with the order in which life itself developed. At first, these were some kind of "simple" (but only in comparison with what later arose from them) systems of self-reproducing macromolecules, apparently, nucleic acids. Then they happened to surround themselves with a phospholipid membrane, which allowed them to build their own microcosm within it. This is how cells came into being. Proteins played an increasingly important role in the functioning of these first living things, but control over nucleic acids was fully preserved. The cells became more complex and learned to divide more and more correctly. After division, they sometimes did not diverge, forming colonies. These colonies faced more and more complex problems due to their size and shape - all the cells in the colony had to be supplied with everything they needed to live. The solution to these problems was achieved due to a certain structure of the colonies and the division of labor between their constituent cells. Simple colonies have turned into cell states, that is, into multicellular organisms. The problems of their self-reproduction as complex structures were also solved, and this was realized in such a way that each organism could develop from one cell through the deployment of a complex genetic program that regulates cell division and interactions between them.

However, this standard order of presentation says nothing about how biological knowledge was obtained. It was obtained as science moved in the opposite direction: from organisms to organs, cells, macromolecules and atoms. As they descended to each of these levels, scientists could only guess how the next, deeper level works. At one time the most they could do was to open up a body, look at the organs and guess how they function. When cells were discovered, they were at first believed to be empty. Then protoplasm was discovered, but at first it was seen only as a viscous liquid which somehow mysteriously contained the essence of life. The nucleus and the organelles of the cell were discovered. Dyes were found that stain them differently, which gave a first approach to their chemical composition. At the end of the nineteenth century nucleic acids were discovered and their approximate chemical composition was established, but their specific structure long remained a mystery, and its solution, when it came, looked brilliant. At this point the descent into the depths of biology perhaps came to a halt. A period of accumulating particulars at this deep molecular level began, and there turned out to be an unusually large number of them. We are now living through a period in which this huge number of particulars is beginning to be combined into a coherent picture, a model of the structure of a living organism. This model is so complex that it cannot be fully grasped by the human mind, so that not only its construction but also its visual description and use are impossible without modern computers. Nevertheless, by the end of the twentieth century all the basic principles of biology had been discovered. Classical genetics, through the efforts of several talented scientists, took almost its complete form during the first three decades of the twentieth century as a harmonious and logical science.

Classical genetics is a vivid example of a researcher moving from the macrolevel to the microlevel. It reconstructs the scheme of a system from its behavior, treating it as a black box, as if the mechanisms of some unknown alien device had fallen into the hands of scientists without any diagrams or instructions. Two of its main features can be noted. The first is the amazing depth of reconstruction it achieved while lacking direct information about the structure of its object. The power of the classical genetic approach is impressive: dealing only with visible traits, it made it possible to form an idea of genes as purely conceptual entities, of their arrangement in some mysterious linear carriers, and of changes in genes and in these carriers. From the pattern of inheritance of traits alone it yielded an idea of the structure of the carriers of genetic information, of the transfer of this information to descendants and of its transformation into living flesh. The second feature is the already mentioned synthetic rather than analytical nature of genetic knowledge, whose validity was, in the very process of obtaining it, immediately embodied in the creation of something new: organisms with new characteristics. It is enough to have well-studied genetics for a few model objects; the remaining objects can then be judged by their degree of similarity to them. The well-known aphorism of Thomas Morgan, “what is true for a fly is true for an elephant”, is of course a rather strong exaggeration, and we will see this for ourselves. However, this approach (which also found expression in the so-called law of homologous series) still works.

The main method of classical genetics is crossing. Geneticists reached most of their conclusions by observing the behavior of traits in parents and offspring, and the researcher's actions with each new generation are determined by the results obtained in the previous one. Genetic research is therefore a bit like a game of chess. The conclusions drawn from such studies were extremely detailed and, as the further development of science showed, correct. Gregor Mendel, in his experiments on peas in the mid-19th century, in effect postulated the existence and described the behavior of chromosomes in meiosis without having the slightest idea about chromosomes. The connection between genes and chromosomes was established only at the beginning of the 20th century, and almost until the middle of that century proteins were strongly suspected of being the material carrier of heredity. In other words, while other branches of biology had not really broken away from the descriptive approach, genetics in its models ran far ahead of the time when the objects it studied could be described as material entities. In the tragic period of Russian science that fell under ideological dictate in the 1930s-1950s, this served as a pretext for declaring genetics an idealistic pseudoscience, for throwing our country, which had been at the forefront of the field, far back, and for physically destroying its best geneticists.

This cognitive power of classical genetics, a science able to draw correct conclusions about the behavior of certain microstructures of the cell from the behavior of traits in crosses, without any idea of what those structures consist of, is due primarily to the fact that genetics contains a great deal of mathematics from various branches of that discipline. This in turn is due to the fact that the object of genetics is not some biological structure but information. Information can be studied regardless of the material medium in which it is implemented. A programmer, for instance, does not need to know exactly how a program will be embodied in the states of the crystals in a computer processor, although he is aware that it will be implemented on this physical basis. Genetics is essentially biological informatics. Informatics used to be called cybernetics, and this was another "pseudoscience" persecuted under Stalin and Khrushchev, for all the difference between the two. (Fortunately, as a branch of mathematics it was not as developed at the time as genetics was as a branch of biology, so this campaign caused less damage.)

"Classical" genetics (sometimes called Mendelian, although what is meant is much broader than what Mendel discovered, and here the notorious ideological label "Mendelism-Morganism" would be more appropriate) can be defined as the science of heredity that operates with abstract elements of the system controlling the development of the organism, abstracting from their material carrier and essentially not needing it. Correspondingly, molecular genetics can be defined as the science of the molecular mechanisms underlying heredity. I hope it is unnecessary to urge you not to attach great importance to these and similar formal definitions. In real scientific practice there are no "two genetics", let alone a border between them, and the definitions above only indicate the general direction of thought.

However, it is known that any definition of anything is imperfect, since our thinking is not mathematical logic, and concepts, which are what our thinking operates with, cannot be reduced to words, with which we record and communicate the results of thinking with certain losses. Concepts can only be understood (with varying degrees of clarity) by observing their interactions with previously acquired concepts across a body of texts in which concepts are denoted by words. A definition is just the most concise and effective text that brings you closer to understanding, but there are always situations in which any given definition fails (even though the concept still works). Where possible, I try to give the definitions that seem best to me, without caring too much about how they correspond to previously proposed or original ones, but I do not take them too seriously, and I am very far from the idea that taking them down by dictation and memorizing them can make the subject easier to understand.

At first, genetics consisted of the lonely feat of a single scientist who was not understood by any of his contemporaries and who, thanks to his personal genius and broad education, himself proposed a fruitful methodology, meticulously conducted long and extensive experiments, and made a far from obvious speculative assumption. Soon after the rediscovery of genetics, that is, its emergence as a science practiced by many, it was discovered that the factors of heredity are located in a strictly defined order and at definite distances from each other in several linear structures whose number, relative size and behavior coincide with the number, relative size and behavior of chromosomes in meiosis. The chromosome theory of heredity was formulated in 1900-1903 by the American cytologist Walter Sutton and the German embryologist Theodor Boveri and was further developed by the famous American geneticist Thomas Morgan and his school: Muller, Sturtevant and Bridges. (From 1906 on they carried out their research on fruit flies, which was a first; initially they had planned to work with rabbits, but this plan was not approved by the financial manager of their university. Charles Woodworth was the first to culture fruit flies, and he also suggested that they could become a convenient object for the study of heredity.) And this important conclusion that the factors of heredity reside in chromosomes, obtained so early, was rejected by official science in the USSR from the late 1940s to the early 1960s!

Comparison of the speculative genetic maps (the relative positions of genes in these structures) with different regions of chromosomes made it obvious that genes are located in chromosomes. But classical genetics does not really need this: its models, tested by the results of crosses, place genes in a kind of "virtual chromosomes". So, to this day, for most objects there are two kinds of chromosome maps: physical maps, which show exactly where genes are located on chromosomes visible under a microscope or on a DNA molecule, and genetic, or recombination, maps, which reconstruct the mutual arrangement of genes from the results of crosses. The order of genes in these two types of maps coincides completely, but the relative distances between them far from always do, and there are quite comprehensive explanations for this, which will be discussed later.

As a science of information and control, classical genetics even has a structure similar to that of mathematics. It rests entirely on a system of speculative a priori concepts with which observed phenomena are correlated (in contrast to, for example, cytology, whose conceptual apparatus is introduced on the basis of empirical facts visible to the eye). Unfortunately, a certain inconsistency has accumulated over the lifetime of genetics in the terminology corresponding to these concepts (concepts and terms are not the same thing), and I will dwell on it specifically so that you are not misled by divergent usage in the genetic literature. Of course, genetic concepts are introduced on the basis of observable facts, but the main ones are introduced rather as speculative mathematical concepts. Genetics has many concepts and corresponding terms. But they are really needed, and once introduced they practically exhaust the subject. In many cases it is enough to match an observed phenomenon with a suitable concept, and everything becomes clear. Perhaps a good explanatory dictionary of genetic terms could serve as a textbook on genetics. It would be pedagogically more correct to introduce the conceptual apparatus and terminology as the need arises. But there is no harm in introducing and discussing the basic concepts at the outset and then marking the places where they are needed. We will assume that you are already familiar with some concepts, at least from the school course, and will sometimes use them before they are discussed in detail.

1.2. Traits of organisms. Phenotype and genotype.

Perhaps the most important genetic concept is the trait. Genetics as a science began at exactly the moment when Gregor Mendel began to analyze individual traits rather than heredity as a whole. Tell me, what is a trait? And how many of them can there be? A trait is anything associated with an individual, as long as there is some way to register it. Height, weight, color, the pitch of a cry, half the length of the tail added to the square root of a third of the length of the nose, the number of hairs in the beard, the shape of the burrow or anthill, the number of males pursuing one female, the length of time one can hold one's breath underwater, the number of lovers of the mother or daughter of the subject under study. I am not joking: among the traits of carriers of a certain variant of one of the dopamine receptors, the trait "grew up without a father" occurs with high frequency (clearly this is more a trait of one of the parents than of the subject under discussion, who could, however, have inherited a predisposition).

The choice is huge, but the more apt, wise or witty your choice of trait, the more information you will extract from the experiment. Clearly, you should not add the square root of the length of the nose to the length of the tail: both lengths have the same dimension, so the square root of one of them no longer matches the other, and the result is mathematical gibberish. But adding the cube root of body mass to the length of the tail makes more sense, because mass depends on the cube of the linear dimensions; having extracted the cube root, we get a value commensurate with the length of the tail, and by adding the two quantities we obtain a certain measure of linear dimensions.
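The dimensional argument can be checked with a small Python sketch. It assumes, purely for illustration, an animal scaled up isometrically by a factor k at constant density, so that every length grows k times and mass grows k³ times; the function names and the numbers are hypothetical.

```python
from math import isclose, sqrt

def size_index(tail_length, body_mass):
    """Tail length plus the cube root of body mass (the sensible combination)."""
    return tail_length + body_mass ** (1 / 3)

def odd_index(tail_length, nose_length):
    """Tail length plus the square root of nose length (dimensionally inconsistent)."""
    return tail_length + sqrt(nose_length)

k = 2.0                              # hypothetical isometric scale factor
tail, nose, mass = 10.0, 1.5, 27.0   # hypothetical measurements

# Lengths scale as k and mass as k**3, so the cube root of mass also scales as k:
# the whole index is simply multiplied by k, as a measure of linear size should be.
scales_consistently = isclose(size_index(k * tail, k**3 * mass),
                              k * size_index(tail, mass))

# The square root of a length scales as sqrt(k), so this index does not scale as k.
odd_scales_consistently = isclose(odd_index(k * tail, k * nose),
                                  k * odd_index(tail, nose))

print(scales_consistently, odd_scales_consistently)
```

Under these assumptions only the first index behaves like a length when the animal is enlarged, which is the sense in which it is a measure of linear dimensions.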

It is easy to see that, among the infinite variety of traits, not all are equally informative. Some are equally informative but add nothing to each other. For example, if we take the length of the right leg and the length of the left leg, it is intuitively clear that although the two legs may differ slightly in length, the second trait adds little to the first. Now take the length of the left leg and height. What can we say about them? The greater the height, the longer the legs: this is quite obvious. Height and leg length are correlated, no more but no less. Indeed, if we take a sample of people, measure height and leg length and calculate the correlation coefficient, it will be quite close to one and highly significant. But we know that people do in fact differ in being short-legged or long-legged. If we take height and the ratio of leg length to height, we get two essentially independent traits, linear size and long-leggedness, which can be inherited independently.
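A minimal sketch of this reasoning in Python, on synthetic data. The sample is generated under the assumption that relative leg length is independent of height; the means and spreads are invented for illustration and are not measurements.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(1)  # fixed seed so the run is reproducible

# Synthetic people: height in cm and an independent "leg proportion" (legs / height).
heights = [rng.gauss(170, 8) for _ in range(5000)]
proportions = [rng.gauss(0.47, 0.01) for _ in range(5000)]
legs = [h * p for h, p in zip(heights, proportions)]

# Height and leg length are strongly correlated ...
r_height_leg = pearson(heights, legs)
# ... while height and the ratio leg/height are practically uncorrelated.
ratios = [leg / h for leg, h in zip(legs, heights)]
r_height_ratio = pearson(heights, ratios)

print(round(r_height_leg, 2), round(r_height_ratio, 2))
```

The ratio recovers the independent component that was put into the data, which is why it is more informative than a second length would be.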

We now have a ratio of two measured quantities. As a rule, working with many traits at once requires proper statistical processing, and for this kind of processing ratios are not very convenient. But there is a set of mathematical methods called multivariate statistics (in particular, principal component analysis for quantitative traits) that allows us to obtain, from any N measured traits, N new traits that are linear combinations of the original ones (their sums with different coefficients) and do not correlate with each other. This means that each of them carries independent information. If we look at how these N new traits are composed, we will see that one of them reflects, for example, linear dimensions (it will include all sorts of lengths of the body, arms, legs, etc.), another reflects thickness, a third unevenness of thickness (how pronounced the waist, hips and bust are), a fourth the relative size of the head, a fifth skin darkness, and so on. Such traits are the most informative, and they contribute differently to the total variability of the objects; these contributions can also be estimated. However, multivariate methods do not solve the problem of duplicated traits, since duplication affects the relative contribution to total variability of the new trait into which the duplicates fall. This problem has not yet been solved in mathematical statistics.
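The idea of replacing correlated traits by uncorrelated linear combinations can be sketched for the simplest case of N = 2. With two traits the principal axes can be found by a plain rotation, so the example needs no linear algebra library; a real analysis with many traits would normally use one. The data are synthetic and the trait names are hypothetical.

```python
import random
from math import atan2, cos, sin

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Sample covariance; cov(xs, xs) is the variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

rng = random.Random(2)
# Synthetic data: a hidden "overall size" factor drives both measured traits.
size = [rng.gauss(0, 1) for _ in range(2000)]
arm = [60 + 5 * s + rng.gauss(0, 1) for s in size]
leg = [80 + 7 * s + rng.gauss(0, 1) for s in size]

a, b, c = cov(arm, arm), cov(arm, leg), cov(leg, leg)
theta = 0.5 * atan2(2 * b, a - c)  # rotation angle that removes the covariance

# New traits: linear combinations of the centered original traits.
ma, ml = mean(arm), mean(leg)
pc1 = [cos(theta) * (x - ma) + sin(theta) * (y - ml) for x, y in zip(arm, leg)]
pc2 = [-sin(theta) * (x - ma) + cos(theta) * (y - ml) for x, y in zip(arm, leg)]

total = a + c  # total variability is preserved by the rotation
print("covariance of original traits:", round(b, 2))
print("covariance of new traits:", round(cov(pc1, pc2), 6))
print("shares of total variance:",
      round(cov(pc1, pc1) / total, 3), round(cov(pc2, pc2) / total, 3))
```

In this sketch the first new trait plays the role of "linear dimensions" and carries almost all of the variability, while the second carries the small remainder.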

Traits can be very different, but they fall into two large classes: qualitative, or alternative, and quantitative, or continuous. A trait is qualitative when its variability shows up as several alternative variants, that is, when each individual belongs to a definite, clear-cut class and its assignment to one of the classes is beyond doubt. For example, human individuals fall into two such classes, men and women. Women can in turn be divided into several alternative classes. Suppose a girl is wearing either trousers or a single cylindrical piece of material around her legs, that is, a dress or a skirt. We get two classes. The latter case can be divided into two classes, wearing a dress or wearing a skirt, which gives three classes of women. Women themselves can certainly identify many alternative classes of clothing without the slightest difficulty in classifying them. Classic examples: pea flowers are white or purple; fruit fly eyes are, again, white or purple; amusingly, both organs can also be pink, and this is one more state of a qualitative trait, a separate class. When qualitative (alternative) traits can be distinguished and individuals belonging to different classes (variants) are regularly found in nature, it is customary to speak of polymorphism, and the classes (variants) of such traits are usually called morphs, or forms. Originally this is the same word, in Greek and in Latin, but the meaning of the second is too ambiguous, and it is better to avoid it. Etymologically both words denote shape, but as terms they are applied to any trait, for example one associated with color. Shown below are two morphs, with yellow and purple flowers respectively, of the Altai high-altitude violet; they occur in nature with approximately equal frequency.

[Figures not reproduced: photographs of flower color morphs. Source image: https://pandia.ru/text/78/138/images/image002_73.jpg]

Since we all went to school, we may suspect that the white and purple irises are homozygotes for some alleles and the lilac one is heterozygous for these alleles. But we (I, in particular) do not yet have such information, and in any case we must start by stating that there are three color morphs.

We mentioned three distinct color classes of pea flowers: white, purple and pink. But on Zolotodolinskaya Street there grow apple trees with purple petals, and there are also apple trees with pink, faintly pinkish and white petals. In the case of carnations sold at stalls, flower color looks to us like a qualitative trait: there are red, white, pink, and white with a red edging of the petals. But flower breeders probably have such a variety of carnations that the trait turns into a quantitative one. You can take a spectrophotometer, extract the anthocyanin pigment from a standard sample of petals and measure the intensity of the purple anthocyanin coloration, expressed as a number. We then get a quantitative trait, that is, a trait that can be expressed by a real number. One and the same trait can act as quantitative in some situations and as qualitative in others. For almost any qualitative trait you can find a way to measure it and thus treat it as quantitative. Conversely, most quantitative traits cannot be treated as qualitative, since the values of the measured parameter are rarely grouped into clearly distinguishable classes.

Human height (if we exclude obvious dwarfism) is a typical quantitative trait. How many variants of height are there for a normal person? Right, you cannot say: it is a positive real number, and the number of "variants" depends on the accuracy with which we measure and on the physical limits of this quantity. The height of a large group of people can be characterized by its mean value. But we also need some characteristics of its variability. For this we have to study the frequency distribution of the quantitative trait. Another textbook example: if you take a lot of people, measure their height to the nearest centimeter and line them up so that people of the same height stand in one column, you get the following picture: the lengths of the columns form a kind of bell-shaped curve. With sufficiently fine measurement of height and a sufficient number of people, it will closely reproduce what is well known in probability theory as the normal, or Gaussian, distribution.

Variance is the mean square of the deviations of individual values from the mean. The square root of this value gives the standard deviation; its dimension coincides with that of the measured quantity, and it can serve as a measure of the spread of the trait. The interval from the mean minus one standard deviation to the mean plus one standard deviation contains about 68% of all normally distributed objects, no matter how many of them we measure. If this interval around the mean is doubled, it contains about 95% of the objects; if tripled, about 99.7%.
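These definitions can be made concrete with a short Python sketch. The five heights are a hypothetical toy sample, and the large sample is drawn from a simulated normal distribution with invented parameters; for a normal distribution the theoretical shares within one, two and three standard deviations of the mean are close to 68%, 95% and 99.7%.

```python
import random
from math import sqrt

def variance(xs):
    """Mean square of the deviations from the mean (population variance)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Toy sample (hypothetical heights in cm): the mean is 170.
heights_cm = [160, 165, 170, 175, 180]
var = variance(heights_cm)   # (100 + 25 + 0 + 25 + 100) / 5 = 50.0 cm^2
sd = sqrt(var)               # same dimension as the trait itself (cm)

# Simulated normally distributed trait; the parameters are assumptions.
rng = random.Random(0)
sample = [rng.gauss(170, 7) for _ in range(50000)]
m = sum(sample) / len(sample)
s = sqrt(variance(sample))

def share_within(k):
    """Share of the sample lying within k standard deviations of the mean."""
    return sum(abs(x - m) <= k * s for x in sample) / len(sample)

shares = {k: share_within(k) for k in (1, 2, 3)}
print(var, round(sd, 2), {k: round(v, 3) for k, v in shares.items()})
```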

The central limit theorem of mathematical statistics states that the distribution of the sum of a large number of independent random variables approaches normal. And almost any quantitative trait is formed under the influence of a large number of multidirectional and different in strength factors (this is especially true for body size). That is why most of the quantitative characteristics are subject to a normal distribution.

However, this statement is true only to a first approximation. As you know, to assess the acceptability of a model it is imperative to pay attention to the boundary conditions. The normal distribution is symmetric and is defined on the entire set of real numbers, from −∞ to +∞, although the probability density decreases rather quickly with distance from the mean. Let us return, as an example, to the trait "human height". Indeed, there is no hard upper limit on a person's height, and whatever record holder we find, there is never a guarantee that a taller subject will not turn up sooner or later. But there is a lower limit even in theory: a person's height, by definition, cannot be less than zero. This means that the boundary conditions do not admit the Gaussian model for human height. Moreover, if we take many people, we find that their distribution by height is slightly asymmetric, with a longer right tail: the physical lower limit at zero makes itself felt! What model can we propose instead of the Gaussian one as more adequate for quantitative traits of biological objects?

Let us think about this. Traits are formed in the course of the individual development of the organism, which is in fact a very complex chemical reaction proceeding under the control of genes, which at certain times provide certain concentrations of certain substances. These concentrations enter as factors into the rate equations (for example, the Michaelis-Menten equation) of the reactions that make up individual development, and the values of traits depend directly on some (or even all) of these rates. Therefore the contributions of individual genes to a quantitative trait usually do not add up but multiply, that is, each gene increases or decreases the value of a trait by some factor. The product of many independent positive random variables tends to a lognormal distribution. As a result, the real distributions of quantitative traits of organisms are not normal but lognormal. The two are indeed very similar, but the lognormal distribution is somewhat asymmetric, with a longer, flatter right tail.
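The contrast between additive and multiplicative contributions can be illustrated with a simulation. This is only a sketch under stated assumptions: 30 hypothetical "genes", each contributing an independent random factor drawn uniformly between 0.9 and 1.1; neither the number of genes nor the range comes from real data. Skewness (the standardized third moment) is used as a simple measure of asymmetry: it is near zero for a symmetric distribution and positive when the right tail is longer.

```python
import random
from math import log, prod
from statistics import mean, pstdev

def skewness(xs):
    """Standardized third central moment: about 0 for symmetric data."""
    m, s = mean(xs), pstdev(xs)
    return mean(((x - m) / s) ** 3 for x in xs)

rng = random.Random(4)
N_GENES, N_INDIVIDUALS = 30, 20000   # hypothetical sizes

# Each hypothetical gene contributes a random factor between 0.9 and 1.1.
factors = [[rng.uniform(0.9, 1.1) for _ in range(N_GENES)]
           for _ in range(N_INDIVIDUALS)]

additive = [sum(f) for f in factors]         # contributions add up
multiplicative = [prod(f) for f in factors]  # contributions multiply
log_multiplicative = [log(v) for v in multiplicative]

print("skewness, additive trait:      ", round(skewness(additive), 2))
print("skewness, multiplicative trait:", round(skewness(multiplicative), 2))
print("skewness, log of the latter:   ", round(skewness(log_multiplicative), 2))
```

The logarithm of a product is a sum of logarithms, so the central limit theorem applies to the logarithm of the multiplicative trait; this is why taking logarithms restores near-symmetry in the sketch.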

[Figure not reproduced. Source image: https://pandia.ru/text/78/138/images/image007_23.jpg]

Normal (A and B) and dwarf (C) peas

It is this trait, the relative length of the internode, that is the alternative trait here, whereas plant height very rarely behaves like a true alternative trait.

There is one more conventionally distinguished class of traits of which you need to have a clear idea. Take a trait such as the number of tines on the antlers of a maral. The smallest antlers are unbranched. At most we have 10 tines on both antlers. We have no difficulty in assigning a given antler to the class with a certain number of tines, and on this basis we might think that this is a qualitative trait. But here the quality corresponds to an integer, and the number of classes, like the series of integers, is unlimited (no one can guarantee that sooner or later we will not come across a maral with 11 or more tines). Such traits are called countable; they are also called meristic, which can be confusing, since what we do here is count rather than measure. In fact there is a simple pattern here: the larger the antler, the more tines it has; for one more tine to be added, the antler bud simply has to gain a certain critical increment of mass. So the count of tines is just a measure of the size of the antler. In the case of the number of cells on a dragonfly wing this becomes even more obvious. We get the same kind of measure in any measurement once we stop at some level of accuracy. Imagine that we count not the tines of a deer's antlers but the hairs on its young antlers. In effect, we have different measures of antler size, only with different steps (rounding).

Countable traits are handled with the same approaches as quantitative ones, with some peculiarities of mathematical processing. It would be a mistake to apply to them the approaches used for alternative traits. For example, one Moscow group of scientists studied the number of cells in certain zones of dragonfly wings. They counted the cells, determined the mean and standard deviation, and found, for example, that in two different bodies of water the means differed statistically significantly. They concluded that the populations of the two lakes are genetically distinct, on the grounds that alternative traits must be determined by hereditary factors, one or a few. Yet they had handled their trait as a quantitative one! Most likely, in one of the water bodies the dragonflies developed under less favorable conditions and had a smaller wing area, which could accommodate fewer cells, the size of the cells being rather standardized in ontogeny.

Finally, a third large class of features is often distinguished - rank features. We are talking about those cases when we can rank objects on the basis of "more" / "less" ("better" / "worse"), but we do not have a direct opportunity to express this quality of superiority of some over others numerically. The situations in which rank signs arise are quite diverse. On the parade ground, we can easily build soldiers by height, without measuring their height, in the same place, by shoulder straps, we can easily recognize military ranks, knowing in advance in what order they are ranked relative to each other. In some cases, we are forced to subjectively evaluate certain complex integral parameters, for example, the “strength” of individual plants, classifying them into “strong”, “medium” and “weak”.

It is curious that as soon as we have ranks, we already have a rough numerical measurement of the trait, albeit a very approximate or subjective one. Ranks, being ordinal numbers, are themselves integers, and it is already possible to operate with them as with measurable traits. For all the conventionality of such a "measurement", mathematical methods have been developed that make it possible to draw very reliable conclusions from it. Moreover, even undoubtedly qualitative traits can be processed, very roughly, as quantitative. For example, if we have four color morphs, we can treat them not as one qualitative trait but as four quantitative traits, each of which can take two values: 0 (the individual does not belong to this morph) and 1 (the individual belongs to this morph). Experience shows that such artificial "quantitative traits" can be processed successfully.
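The recoding of one qualitative trait into several 0/1 traits can be sketched as follows. The morph names and the sample of six individuals are hypothetical; the point is only the shape of the resulting table.

```python
# Four hypothetical color morphs of one qualitative trait.
MORPHS = ["white", "pink", "purple", "yellow"]

def indicator_traits(morph):
    """Recode one morph as four 0/1 'quantitative' traits, one per morph."""
    return [1 if morph == m else 0 for m in MORPHS]

# Hypothetical sample of individuals scored for flower color.
sample = ["white", "purple", "purple", "yellow", "pink", "purple"]
table = [indicator_traits(morph) for morph in sample]

# The mean of each 0/1 trait is simply the frequency of that morph,
# so ordinary numerical processing (means, variances, correlations) applies.
frequencies = [sum(column) / len(sample) for column in zip(*table)]

for morph, freq in zip(MORPHS, frequencies):
    print(f"{morph}: {freq:.2f}")
```

Each row of the table contains exactly one 1, so the four artificial traits are not independent of each other; this is one reason why such processing is only a rough approximation.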

As the example of the height of pea plants shows, the same trait can be both quantitative and qualitative. Any quality we distinguish can always be measured somehow (even being male or female can be measured as the ratio of certain hormones). The choice of how to handle a trait, as the value of a numerical parameter or as an indicator of belonging to a class, is dictated by the particular problem. In the case of a bimodal distribution it is useful to divide all individuals into two classes, at least to a first approximation, even if the two humps of the distribution merge and we cannot unambiguously classify the individuals that fall between them except by introducing a formal threshold value.

Both qualitative and quantitative traits can be inherited to one degree or another, and therefore fall into the field of vision of genetics. Genetics uses different models to analyze quantitative and qualitative traits. The inheritance of qualitative features (it was with them that Mendel worked) is described in terms of combinatorics and probability theory in a simpler and more accurate way, and we will mainly deal with it. The inheritance of quantitative traits is described in terms of mathematical statistics and is based mainly on the analysis of correlations and decomposition into variance components. As mentioned above, the inheritance of qualitative traits can also be treated as inheritance of quantitative traits, which in some cases turns out to be a very fruitful approach. I hope we will have time to briefly review the beginnings of quantitative trait genetics. In the meantime, a little more terminology.

Two concepts no less broad than the trait, but which we cannot do without, are genotype and phenotype. These terms, as well as the term "gene", were introduced in 1909 by the Danish geneticist Wilhelm Ludvig Johannsen. The phenotype is everything that concerns the traits of the organisms in question; the genotype is everything that concerns their genes. Clearly, there can be an infinite number of traits, and there are tens of thousands of genes. Moreover, nobody registers the overwhelming majority of traits, and nobody knows the overwhelming majority of genes. But phenotype and genotype are working concepts whose content in each specific case is dictated by the genetic experiment. A genetic experiment usually consists in crossing someone with someone, often for many generations, and monitoring the traits of the offspring, which can be selected, crossed and so on according to these traits. Alternatively, a sample of individuals is taken from nature, their traits are registered, the variants in which certain genes are represented are determined, and the dynamics of their frequencies is observed. In each case we follow strictly defined traits and genes, often only a few. When we speak of the phenotype, we mean the values or states of these particular traits, and when we speak of the genotype, the set of these particular genes. The former depends on the latter, but, as we will see, not in the most direct way. Finding out this dependence is in many respects what genetics consists of. And only if the DNA sequence itself is taken as the trait does the phenotype coincide with the genotype.

Only recently has it become possible to conduct high-tech experiments that track all the known genes of those objects for which they are known (for example, humans), for instance the presence or absence of all messenger RNAs or all proteins in a particular tissue. The corresponding fields were named "transcriptomics" and "proteomics", and the totality of all messenger RNAs or all proteins present in a given object is called the transcriptome and the proteome, respectively.

1.3. The concepts of "gene", "locus", "allele", "ortholog", "paralog", "mutation".

Given our earlier statement that there is a lot of mathematics in genetics, one would expect terminological rigor in it. Unfortunately, genetics is also an empirical science that rests on a huge and heterogeneous body of experimental material produced by many scientists of different specializations (and different education!), and this has led to the existence of various terminological "dialects" in genetics, including in very important matters. Let us move on to a concept that may seem central to genetics but in reality turned out to be too vague for that role. What is a gene? This concept was very unlucky, and it now has several meanings. In classical genetics, a gene is an inherited factor that affects the characteristics of an organism. It was once regarded as an indivisible unit of heredity. After the discovery of the structure of DNA, it quickly became clear that many classical genes are sections of DNA that encode a certain protein, for example an enzyme that determines the inherited trait. This was a huge breakthrough in science, and on this wave it initially seemed that all the genes of classical genetics are like that. The following formula was developed: "one gene - one polypeptide chain". It was proposed, in the original formulation "one gene - one enzyme", in 1941 (that is, 12 years before Watson and Crick deciphered the structure of DNA) by George Beadle and Edward Tatum (you can find portraits of these and many other scientists in the textbook). They worked with strains of the mold Neurospora that differed in their ability to carry out certain biochemical reactions and found that each gene is responsible for one specific biochemical reaction, that is, for a certain stage of the mold's metabolism. For this work they received the Nobel Prize in 1958. Note that at that stage the gene was still understood quite classically, but active research was under way to find out what it is physically. After the discovery of the DNA structure everything seemed to fall into place, and a section of DNA encoding a polypeptide chain came to be called a gene.

However, over time it was found that next to the coding sequence there are always regulatory DNA sequences that do not encode anything themselves but affect whether this gene is switched on or off and how intensively it is transcribed. You know them well: the promoter is the landing site for RNA polymerase; operators are binding sites for regulatory proteins; enhancers are also binding sites for regulatory proteins that promote transcription, but they lie at some, sometimes considerable, distance from the coding sequence; silencers are sequences that prevent transcription; and so on. Sometimes these sequences are hundreds or thousands of nucleotides away (on the scale of a chromosome this is not much), but they still act as cis-factors (that is, factors acting from nearby on the same DNA molecule), since a certain packing of the DNA brings them physically close to the coding sequence. All this machinery came to be regarded as belonging to the gene that encodes something. Thus, in the molecular genetics of eukaryotes, a gene is a coding region of DNA together with the adjacent regions of DNA that affect its transcription.

For such a region, S. Benzer proposed in 1957 the more precise term cistron, which was also unlucky: it came to denote only the coding region of DNA (the so-called open reading frame), and sometimes also the region of DNA between the promoter and the terminator from which a single RNA molecule is read. You will remember that in prokaryotes, whose molecular genetic mechanisms were clarified earlier, the operon organization of genes is widespread: sequences encoding several polypeptide chains share common regulation and are read as part of a single mRNA. This does not allow the above definition of the term "gene" to be used. The term "cistron" is also of little use here: defined as a piece of DNA from which a single RNA is read, it would include regions coding for several different proteins, a situation that, ironically, was once called "the polycistronic principle of organization of genetic material". As a result, using the terms "gene" and "cistron" without explanation (at the very least, of which kingdom is being discussed) is currently fraught with misunderstandings.

Note that in the molecular biological sense the gene turned out to be divisible into parts: exons, introns, operators, enhancers and, ultimately, individual nucleotides. And a regulatory DNA sequence, taken by itself, lost the right to be called a gene, since it does not itself encode anything. But through its influence on gene transcription this sequence can also affect some trait (that is, the phenotype), which will be inherited along with the sequence. It can also be separated from the coding sequence by recombination, especially if it is a remote enhancer. In other words, a regulatory sequence is also a distinct hereditary factor, which, moreover, has its own place on the chromosome. Some regulatory sequences, such as enhancers, can affect the transcription of several genes at once, that is, they occupy a definite place in the regulatory network that controls the development and functioning of the organism. All the signs of a gene as understood in classical genetics are present.

This contradiction between the classical and the molecular biological concepts of a gene arose at a time when it seemed that all classical genes are transcribed DNA regions encoding a protein or RNA. It has not yet been overcome, which, however, is not particularly important, since the word "gene" has not been used as a strict term for a long time. Given the rapid development of molecular biology, the molecular definition is winning: a gene is a transcribed piece of DNA together with its regulatory DNA sequences. However, the classical concept, a gene as a hereditary factor (no matter how it functions, what it is and what it consists of), was historically the first, lasted more than half a century and turned out to be extremely fruitful. You need to be aware of this contradiction and learn to work out from the context which meaning is intended.

In practice, this contradiction is resolved in two ways: either before the use of the word "gene" its meaning is preliminarily specified, or it is not used as a term. An example of the first case: in the section "materials and methods" in an article devoted to counting genes in the genome, it will certainly be written by what criterion the gene was determined - for example, the number of open reading frames was counted. In the next article they will write: we analyzed the expression and showed that some of the found potential reading frames are never transcribed and, apparently, are not genes, but pseudogenes. An example of the second situation: we are studying a locus from which several thousand proteins are made due to the fact that there are three alternative promoters, three alternative terminators, and a dozen introns subject to alternative splicing. Where is the gene and how many genes are in this locus? In this case, the word "gene" will only be mentioned in the introduction, as a synonym for the word "locus". If we take a phrase containing the word "gene" from a population genetic context and insert it into a molecular biological context, then we get a loss of meaning.
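
The first way of resolving the contradiction, stating an operational criterion for what counts as a gene, can be illustrated with a minimal sketch. It assumes the simplest possible criterion: an open reading frame is an in-frame stretch from an ATG codon to the nearest stop codon on the forward strand. The function name find_orfs, the min_codons threshold and the example sequence are hypothetical and chosen only for illustration; a real gene count would at least scan the reverse strand as well and require far longer frames.

```python
# Toy criterion for "gene": an open reading frame (ORF) on the forward strand.
# All names and thresholds here are illustrative assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(sequence, min_codons=3):
    """Return sorted (start, end) index pairs of ORFs, end exclusive.

    An ORF starts at the first ATG met in a reading frame and ends with the
    next in-frame stop codon; its length in codons includes the stop codon.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a new reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # close the frame
    return sorted(orfs)

print(find_orfs("CCATGAAATTTTAGGG"))  # [(2, 14)]
```

Changing min_codons changes the number of "genes" found in the same sequence, which is exactly why the criterion has to be stated in "materials and methods".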

Different variants of the same gene, in any sense of the word, are designated by the term alleles. In this form the term was proposed by W. Johannsen in 1926, on the basis of the term "allelomorphic pair" introduced by W. Bateson in 1902. The concept of "allele" appeared when nothing was known about the structure of DNA, and it was introduced precisely as an alternative variant of a gene. This concept is especially important for diploid organisms, which receive the same set of genes from the father and from the mother. As a result, each gene is present in the genome in two copies, which can be identical or different, but not so different that they can no longer be called "one and the same gene". These two copies were named alleles.

It's funny, but with regard to the term "allele" there is no unambiguous solution and such a simple question as the grammatical gender of this word in Russian. Moscow, as well as Kiev and Novosibirsk schools, believe that the allele is masculine, Leningrad (St. Petersburg) - that it is feminine. You can see that even the two recommended textbooks use this word differently.

Originally, the term "alleles" was introduced to denote variants of a gene responsible for a particular trait, each variant being associated with a particular state of that trait. However, it turned out that the same trait can be influenced in the same way by genes that are independent of each other. This raises the problem of distinguishing alleles of the same gene from alleles of different genes. Fortunately, it had become clear even earlier that genes are arranged in a strictly defined order in linear structures (chromosomes, as it turned out), so that each gene occupies a strictly defined place on one of the chromosomes. Therefore, each gene could be identified not only by its effect on the trait but also by its place on a particular chromosome. It turns out that every place on the chromosome responsible for some trait, a locus, is occupied by one of the alleles, the individual variants of the gene. The diploid nucleus contains two alleles of each locus, obtained from the mother and the father, different or identical. A locus can be defined as a position on a chromosome occupied by a particular hereditary factor, and an allele as a variant of a particular hereditary factor; since the specificity of a hereditary factor is given precisely by its locus, an allele is a variant of the hereditary factor located at a specific locus. Obviously, this definition is given from the point of view of classical genetics. It is better to say "locus on the chromosome" than "locus of the chromosome", because the latter may create the impression that the chromosome is composed only of loci that have a genetic meaning. A gene in the classical sense does indeed correspond to a certain piece of DNA on a chromosome, and very often non-coding DNA regions can influence something at least indirectly (for example, the presence of a block of repeats can promote chromatin compaction and thereby affect the intensity of transcription of coding DNA segments located even at a considerable distance from it). Nevertheless, there certainly exist extended sections of DNA that have no genetic content at all: they do not affect anything and are not genes in any sense.

But the terms "locus" and "allele" also have an amusingly broad meaning. If we study the DNA sequence itself, which in this case is both our trait and our gene, since it literally encodes itself, we can call any part of it that can be recognized in some way a locus, and any variant of that part an allele. For example, the genome contains the so-called "microsatellites" - tandem repeats (located one after another) of very short sequences, two or three letters long. The number of these repeats changes very easily through mechanisms associated with replication slippage or unequal recombination. In fact, it is through these mechanisms that they arise and multiply in the genome, while they have no function of their own and are not genes in the molecular sense. Because of their high variability, evolutionary geneticists like to study microsatellites: the number of repeat copies allows relatedness to be judged with a certain degree of confidence. In this case, too, it is customary to talk about alleles, meaning microsatellite sequences of different lengths (that is, with different numbers of repeat copies).
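
The idea that microsatellite alleles are simply different repeat numbers can be shown with a small sketch. It assumes that each individual is represented by one already extracted sequence containing a single run of the motif; the function name repeat_count, the motif "CA" and the three sequences are hypothetical and serve only as an illustration.

```python
import re

def repeat_count(sequence, motif):
    """Number of motif copies in the longest uninterrupted tandem run."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence.upper())
    return max((len(run) // len(motif) for run in runs), default=0)

# Hypothetical sequences of one microsatellite locus from three individuals.
individuals = {
    "plant_1": "GGT" + "CA" * 5 + "TTG",
    "plant_2": "GGT" + "CA" * 7 + "TTG",
    "plant_3": "GGT" + "CA" * 5 + "TTG",
}

# The allele at this locus is named by its number of repeat copies.
alleles = {name: repeat_count(seq, "CA") for name, seq in individuals.items()}
print(alleles)  # {'plant_1': 5, 'plant_2': 7, 'plant_3': 5}
```

Here plant_1 and plant_3 carry the same allele in the expansive sense described above, although nothing in this locus is a gene in the molecular sense.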

It turned out that the word "gene" in classical genetics can be abandoned altogether. There is a locus - a place on the chromosome, which is always occupied by one of the alleles. The relationship between a locus and an allele is the same as the relationship between a variable and its value. Moreover, in accordance with the classical definition, both a locus is a gene (as a generic concept), and an allele is a gene (as an individual concept). You can often hear "these genes are not allelic to each other," that is, they speak of allelic and non-allelic genes, that is, alleles of one locus and alleles of different loci. In the practice of genetics, a not very strict tradition has been established to use the word "gene" as a synonym for the word "locus", and such examples will also be encountered in our text.
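
The analogy between locus and variable, and between allele and value, can be written down almost literally. The sketch below is only an illustration under stated assumptions: the genotype is that of a hypothetical diploid pea plant, and the function names are invented for this example.

```python
# A locus plays the role of a variable; an allele is the value it holds.
# A diploid individual holds two values per locus, one from each parent.
genotype = {
    "a": ("A", "a"),      # locus a: two different alleles
    "a2": ("A2", "A2"),   # locus a2: two identical alleles
}

def alleles_at(genotype, locus):
    """Return the pair of alleles occupying the given locus."""
    return genotype[locus]

def has_two_different_alleles(genotype, locus):
    """True if the maternal and paternal alleles at the locus differ."""
    first, second = alleles_at(genotype, locus)
    return first != second

print(has_two_different_alleles(genotype, "a"))   # True
print(has_two_different_alleles(genotype, "a2"))  # False
```

In these terms, "allelic genes" are two values of the same key, and "non-allelic genes" are values stored under different keys.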

But there are situations where the word "gene" is difficult to avoid. Suppose peas with red flowers were treated with a chemical mutagen and peas with white flowers were obtained. The trait "flower color" was found to be inherited as if determined by a single locus; in such cases it is customary to speak of a monogenic trait (although the non-existent term "monolocal" would be more accurate). However, peas with white flowers are already known, and this trait is determined by an allele of a well-known locus. The question is: did we obtain the same allele of that locus, or a different allele (at the level of DNA sequence) of the same locus that nevertheless also leads to white flowers? Or an allele of a new, previously unknown locus, which may, say, be responsible for a completely different stage of pigment synthesis? Until this has been established, we have to say loosely: "We obtained a gene for white flowers." Incidentally, this describes a real situation from the life of our laboratory: we obtained a gene determining white flowers that turned out to be allelic not to the widely known locus a, responsible for the anthocyanin coloration of the flower, but to the little-known locus a2.

The terms locus and allele can also be applied to a gene in the molecular genetic sense - namely, to a specific sequence of nucleotides. Here the meaning of the terms "locus" and "gene" coincides, and the allele will mean a specific nucleotide sequence of a given gene... However, within the framework of molecular genetics, the need for these terms does not arise so often, since molecular biological consideration is usually abstracted from the existence in a diploid organism of a second such gene, with an identical or slightly different sequence, in a homologous chromosome.

You probably know from molecular biology about the existence of multigene families, in which the genome contains several genes (in the molecular sense) encoding a protein product of the same type, for example the same enzyme. These genes may differ somewhat in primary structure, both of the DNA and of the protein product, as well as in some physicochemical properties of the protein product, in the intensity of its molecular function, and in the features of expression, that is, the place, time and intensity of synthesis. The same pea has seven genes (in the molecular sense) for histone H1; each encodes a particular variant of the molecule, one of which is present only in actively dividing cells and disappears from the chromatin of cells that have completed division. Any sequence of any of these genes is a variant of the histone H1 gene. But within one genome these seven genes occupy different loci, so only different variants of one particular locus are alleles. You should be familiar with the concept of homology - similarity based on common origin - and with homologues, objects that show such similarity. In molecular genetics, two types of gene homology are distinguished. Homologous but non-allelic genes that occupy different loci in the same haploid genome are called paralogs (from the Greek "para" - beside, near). Individual variants of the same locus in different individuals are called orthologs (from the Greek "ortho" - straight, direct; recall the ortho- and para-isomers of organic chemistry). Orthologs are essentially alleles. However, molecular biologists usually use the term "ortholog" when they study genes of different species, in cases where it can be unambiguously established that the genes occupy the same locus, while the term "allele" is applied only to gene variants within the same species or in closely related species that are nevertheless capable of interbreeding (for example, wheat and its wild relatives). Thus, an allele is a specifically genetic concept: one speaks of alleles when they can, in principle, take part in a cross.

Let us ask ourselves a question - where did the paralogs come from? It is logical and correct to assume that they arose as a result of gene duplication - that is, rare cases of gene “multiplication” in the genome. Naturally, any such event, no matter how rare, occurs within a single species. As a result, we have a situation when some individuals of the same species have two loci in the genome that are identical in their primary structure (over time, it can accumulate differences), while others have only one. Suppose that two copies of a multiplied gene are located side by side, so that both new loci are located in the same place as one old one. And so they start to accumulate differences. Where and what are the alleles here? We have considered the situation when the concept of "allele" fails, and this is very good, since in this way we have traced the border of its applicability.

By the way, an unexpectedly non-trivial question is what counts as different and as identical alleles. In the early stages of the development of genetics, alleles were recognized only by phenotype, and only those leading to different phenotypes were considered different alleles. Most often there were two alleles, normal and defective (mutant), so the theory of presence-absence (of a certain function) was popular at that time. However, as genetics developed, more and more cases became known in which the same trait has several inherited variants, which ultimately gave rise to Thomas Morgan's famous aphorism: "One presence cannot correspond to several absences." And in the case of quantitative traits determined by many genes at once, a single allele has no distinct phenotypic manifestation at all. As a result, the convention settled on was to treat alleles as a priori different if, in the given experiment, they are not known to be inherited from the same individual, that is, if they are not identical by origin or such identity has not been established. For example, we catch in nature one hundred seemingly identical individuals in order to study fine nuances of the phenotypic manifestation of a gene, cross them with special tester lines, transfer the studied gene obtained from them into an identical genetic background and measure the trait of interest, and in doing so we assume that one hundred different (by origin) normal (!) alleles take part in the experiment (all of them obtained from nature, from viable individuals).

You understand that when it became possible to decipher the primary structure of the genes under study, the question of allele identity ceased to be theoretical and was reduced to the identity of their primary structure (nucleotide sequence). If there is at least one substitution, the alleles are different, if not, they are the same, since they are completely identical molecules. Taking into account the possibility of accumulation of nucleotide substitutions, many of which do not affect the function of the locus, in practice this approach differs little from a priori considering any alleles independently obtained from different individuals to be different. However, the rate of occurrence of substitutions varies greatly from locus to locus - for example, in some loci we observed an identical nucleotide sequence even in alleles obtained from different subspecies of peas (wild and cultivated).

Let us touch on such loose but popular terms as "wild-type allele", "mutant allele" and "null allele". The aforementioned "presence-absence theory" is quite applicable in many cases. Let us take the same peas as an example. Pea flowers contain a pigment, anthocyanin, which colors them pink-red (purple). If any of the proteins involved in the biochemical chain of anthocyanin synthesis is defective or absent, anthocyanin is not synthesized and the flowers remain white. Suppose that a certain chromosome carries a locus, which we will designate a, containing the DNA sequence that codes for one of these proteins. Usually one speaks less strictly but more simply: a certain chromosome carries a gene a, which encodes one of these proteins (peas really do have a gene with this designation; it encodes a regulatory protein that binds to DNA rather than an enzyme involved in anthocyanin synthesis). Let this gene have two alleles, designated A and a. Allele A encodes a normal functional protein. Allele a does not encode a functional protein. How this is possible we will discuss later; what matters now is that this allele simply "does not work": it does not perform its molecular function, even if that function is unknown to us. In such cases the normal allele is called the wild type. In the case of peas this term is doubly correct. Peas exist both as cultivated and as wild plants (representatives of the same species continue to exist in the wild). All wild peas have purple flowers, while cultivated peas have both purple and white flowers, and white predominates in vegetable and grain varieties of European breeding. An allele unable to form a functional protein product is often also called a null allele.

There are cases where the notion of "wild type" or "null allele" is not applicable. For example, the two-spot ladybird Adalia bipunctata has two forms: red with black spots and black with red spots. (By the way, this is one of the classical objects of population genetics, introduced into this science by Timofeev-Resovsky.) Both forms occur in the European part of Russia, and neither is better than the other (in Novosibirsk, however, only the second is found). Neither of them can be called wild as opposed to the other. However, it is possible that one of these alleles is associated with a loss of the molecular function of the protein product of this locus, which, like the products of other genes of individual development, is probably a factor influencing the expression of other genes.

Next comes a term popular in genetics: mutation. Historically, the concept was introduced by Hugo de Vries in a meaning close to the one now used in horror films: a sudden change in hereditary factors leading to a radical change in the phenotype. De Vries worked with a species of evening primrose (Oenothera), which, as it turned out later, has extremely unusual cytogenetics: because of multiple chromosomal rearrangements, the entire genome is inherited as a single allele. However, the word has become a widely used term, and not only in Hollywood. Sergei Sergeevich Chetverikov, one of the founders of population genetics, used the term "genovariation", which is more correct but did not take root (although Chetverikov was one of the Russian geneticists who had a significant impact on world genetics). At present a mutation is understood as any change in the primary structure of DNA, from the replacement of a single nucleotide to the loss of huge parts of chromosomes. I would like to draw your attention to the fact that the word "mutation" means the very event of change. However, in loose but tenacious genetic practice the same word is often applied to the result of that event, that is, to the allele that arose through the mutation. People say: "The experiment involves Drosophila carrying the mutation white." No one recorded the mutational event itself that gave rise to this classic mutation (incidentally, it is associated with the insertion of the mobile genetic element copia, which moves extremely rarely, into the gene), but everyone keeps saying "mutation" instead of "mutant allele". It is understood that a mutation once occurred that damaged the normal allele and produced a mutant one. It is easy to see that "mutant allele" is likewise the opposite of "wild-type allele", but it is broader than "null allele", since it covers various deviations from the wild-type allele, both those that lead to a complete loss of molecular function (those same "several absences"!) and those that do not.

There is another very unpleasant terminological situation that some of you will have to face in human genetics. As we will see later, human genetics as a whole has deviated quite strongly in its terminology from general genetics. The reason is that, on the one hand, this specialized field belongs to both biology and medicine and is institutionally isolated from the rest of genetics, so in this sense it stews in its own juice. On the other hand, because of its practical importance, this field is very large (in the number of researchers and studies, journals and articles), which makes its internal traditions resistant to outside influences, including those of its "mother", general genetics. Modern human genetics has advanced so far that in many cases it has realized the age-old dream of geneticists: it has been able to associate certain traits (including pathological ones) with the presence of certain nucleotides in specific positions of specific genes. But it was here that an annoying terminological substitution took place. When we compare many alleles with respect to the primary structure of DNA, it turns out that in some positions the same nucleotide is always found, while in other positions nucleotide substitutions are possible. (There is a suspicion that any nucleotide in any position can be found somewhere among the genomes of all of humanity, which raises an amusing philosophical question: what is the human genome?) The latter were correctly named polymorphic positions, and indeed each such position exhibits alternative variability, that is, polymorphism, with respect to which of the four nucleotides occupies it. But then a substitution of concepts somehow took place: "polymorphism" came to denote a specific nucleotide at a specific polymorphic position (what should be called a "morph"). People began to say something like this: "We sequenced such-and-such a gene in so many people and found twelve polymorphisms: two in such-and-such a position, six in another, and four in a third. Two of the polymorphisms in such-and-such a position showed a significant association with such-and-such a syndrome." Most likely this substitution took place at the level of laboratory slang, which exists in any scientific work and consists of simplifying terminology, often illiterately. Students who come to the laboratory sometimes mistake slang for terminology and begin to use it in all seriousness. At some point it happens that both the author of an article and the reviewers of a scientific journal are accustomed to the same slang; it then penetrates the scientific press and, with some probability, becomes fixed. (The picture, by the way, is more than familiar from population genetics and exactly mirrors the process of speciation, when anomalies in the system for recognizing suitable mates arise by chance in an isolated population, happen to coincide in the two sexes and become fixed; they become the norm in the new species and lead to its reproductive isolation from the old one.) In addition to the etymological contradiction (a single morph is called by a word indicating that there are many morphs) and bad taste, this substitution has a further consequence: researchers who use this jargon have deprived themselves of the term "polymorphism" in its correct meaning. When it becomes necessary to express the corresponding concept (which has not gone anywhere), they have to resort to verbose descriptions instead of an unambiguous term. For example, in situations for which the term "balanced polymorphism" exists - when one of the morphs has an advantage in some conditions and another in others, so that they coexist and do not supplant one another - they have to resort every time to a lengthy description like this one.
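
The difference between a polymorphic position and a morph is easy to show on a toy alignment. The sketch below assumes that the alleles have already been aligned to equal length; the function name polymorphic_positions and the four six-letter sequences are hypothetical and serve only as an illustration.

```python
def polymorphic_positions(alleles):
    """Map each polymorphic position (1-based) to the sorted morphs found there.

    A position is polymorphic if more than one nucleotide occurs at it
    among the aligned allele sequences.
    """
    if len({len(seq) for seq in alleles}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    result = {}
    for position, column in enumerate(zip(*alleles), start=1):
        morphs = sorted(set(column))
        if len(morphs) > 1:
            result[position] = morphs
    return result

# Hypothetical aligned alleles of one locus.
alleles = ["ACGTAC", "ACGTTC", "ATGTAC", "ACGTGC"]

positions = polymorphic_positions(alleles)
print(positions)                                  # {2: ['C', 'T'], 5: ['A', 'G', 'T']}
print(len(positions))                             # 2 polymorphic positions
print(sum(len(m) for m in positions.values()))    # 5 morphs
```

In the strict usage this locus shows polymorphism at two positions; in the slang criticized above, the same data would be reported as "five polymorphisms".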

While introducing you to traditional and not always consistent genetic terminology, I should mention a rather curious term: marker. This term was introduced for loci that are important to us not in themselves but insofar as they mark a certain region of the chromosome. Its appearance was associated with a long period when not very many genetic loci were known. It was needed in situations where a newly discovered gene had to be placed on the map or, paradoxically, where one had to work with genes not yet discovered. For example, the nature of the genes that control economically valuable quantitative traits of plants and animals was completely unknown for a long time, and even now little is known about them. At the same time, there was no doubt that these genes exist and are located on chromosomes. By manipulating known loci, the markers, it was possible to identify regions of chromosomes associated with certain effects on quantitative traits and to use them in breeding work. Initially these were mainly "visible markers", loci that had alleles with a visible effect. Later this approach was seriously developed by bringing biochemical traits into genetic analysis (these, too, were usually not functionally related to economically valuable traits), and later still by the opportunity to work with polymorphism of the chromosomal DNA itself. This led to the concept of the "molecular marker". Thus, the term "marker" is just a synonym for the term "locus" that emphasizes that the locus interests us not as such but only as a landmark on the chromosome. However, people became so used to the term that it came to be used even when the locus is itself the object of study. Paradoxically, in studies of molecular phylogeny the analyzed sequences themselves are also commonly referred to as markers. Here the implication may be that they are only landmarks in time and that the nucleotide substitutions in them mark evolutionary events, which, of course, are not limited to changes in the analyzed sequences alone.

Genes (more precisely, loci) are usually denoted by abbreviations consisting of Latin letters and sometimes numbers. Behind these designations stand the full names of the genes, in Latin or, more often, in English. Both full names and abbreviations of genes are always written in italics. For genes with a visible manifestation, the name is usually a word describing the mutant phenotype: w - white (white eyes in the fly), y - yellow (yellow body in the fly), a - anthocyanin inhibition (in peas), op - ovula pistilloida (in peas), bth - bithorax. The last is not a very good name for the Drosophila mutation in which a second pair of wings appears on the metathorax (as on the mesothorax): it is written as if the whole thoracic tagma had doubled. There is even a Drosophila mutation with the official name fushi tarazu (abbreviated ftz), which is Japanese. Playful Americans named one of the genes mothers against decapentaplegic, by analogy with organizations like "mothers against the war in Iraq": in female Drosophila carrying this mutation, offspring that carry a mutation of the gene decapentaplegic do not survive. The abbreviation of this gene sounds no worse: mad. Occasionally, and not in the most popular organisms, the official name of a gene and its abbreviation are unrelated: the mutation that turns pea tendrils into leaflets is designated tl (from tendrilless), while its name is clavicula. If a gene is known by its molecular product (protein or RNA), the gene is named after the product: mtTrnK - mitochondrial transfer RNA for lysine, RbcL - ribulose bisphosphate carboxylase large subunit. It is important that each species has its own, completely independent official nomenclature of genetic symbols. This causes some difficulties at present, when the number of organisms with well-developed genetics of their own has increased, and the number of organisms whose genes are studied not by genetic experiments but by direct reading of DNA sequences is growing like an avalanche (for example, the project "10,000 vertebrate genomes" is already under way).

Genetics began with cases in which only two alleles were known at each locus, and they could be distinguished by writing them with a capital or a lowercase letter, a practice that goes back to Mendel. The capital letter was used for the dominant allele (you know what this means from school; later we will look at dominance in more detail). As a rule, this is the wild-type allele or, as we would say now, an allele with normal, unimpaired molecular function. The locus itself was designated with a lowercase letter, that is, its designation coincided with that of the recessive (mutant, non-functional) allele, because it was through the existence of such an allele that scientists first learned of the locus. In the rare cases when a mutant allele turned out to be dominant, both it and the locus itself were designated with a capital letter.

When it became clear, very soon, that a locus can have many alleles (we now know that there are a great many), allele designations were introduced, written as a superscript after the locus symbol. The "+" symbol is often used as the superscript for the wild-type allele; sometimes the superscript is omitted. For example, at the very first known Drosophila locus, white (w), the wild-type allele is denoted w+ (with "+" as a superscript), the allele responsible for white eyes is w, and the allele responsible for apricot-colored eyes is wa (with "a" as a superscript; full name white-apricot).

I would like to draw your attention to the fact that, for traditional genetic organisms with well-developed genetics of their own, different traditions of writing locus and allele designations still coexist. So far I have found three of them:

  1. Loci with a visible manifestation are written with a lowercase or a capital letter, depending on whether the locus was described from an allele that is recessive or dominant to the wild type; loci known by their molecular function are written with a capital letter. For loci with a visible manifestation and dominance, the tradition is kept of writing recessive alleles with a lowercase letter and dominant ones with a capital letter. This nomenclature is used, for example, in peas and mice. For example, the pea locus a, responsible for flower color, has the alleles A and a.

  2. As in the previous case, but the capital and lowercase letters in the designation of a locus and its alleles are rigidly fixed. This system is used in Drosophila. Here the symbols w and W belong to completely different loci, white and Wrinkled. The wild-type allele is always indicated here by the "+" superscript. (Curiously, geneticists who work with fruit flies and mice, accustomed to the system adopted for their organisms, usually do not even suspect that another system of naming loci exists.)

  3. All letters in the designation of a locus are always capitals. This system is now used in human genetics and was adopted quite recently.

The same allele designations are used to designate phenotypes, but always without italics. So, if you describe the results of an experiment in which you observed a certain number of pea plants with purple flowers and a certain number with white flowers, and you know that white flowers in this experiment are associated with the locus a, then in the frequency table you will designate the purple-flowered and white-flowered plants with the letters A and a, even if you do not know their genotypes. The same is done when you determine the presence of electrophoretic variants of some isoenzyme: there the correspondence between phenotype and genotype is closer, but it is not always unambiguous.

1.4. The concepts of "homozygote", "heterozygote", "hemizygote".

In every diploid organism, each chromosome (except the sex chromosomes) is present in two copies: homologues received from the father and the mother, respectively. Each homologue carries the same set of loci, and in each homologue every locus is occupied by some allele. Consequently, each diploid organism carries two alleles of each locus. When its genotype is recorded, the designations of the two alleles present at the locus (or loci) of interest are written one after the other. For example, with the pea alleles A and a, three genotypes are possible: A A, A a and a a.
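The enumeration of genotypes can be illustrated with a minimal Python sketch. The function name `possible_genotypes` is hypothetical and chosen only for this illustration; the sketch assumes that a genotype is an unordered pair of alleles, so A a and a A count as the same genotype.

```python
from itertools import combinations_with_replacement


def possible_genotypes(alleles):
    """Return every unordered pair of alleles a diploid can carry at one locus."""
    # combinations_with_replacement yields (A, A), (A, a), (a, a) for ["A", "a"]:
    # pairs may repeat an allele (homozygotes), and order does not matter.
    return list(combinations_with_replacement(alleles, 2))


pea_genotypes = possible_genotypes(["A", "a"])
print(pea_genotypes)

# With n alleles there are n * (n + 1) / 2 genotypes, e.g. 6 for three alleles.
fly_genotypes = possible_genotypes(["w+", "w", "wa"])
print(len(fly_genotypes))
```

For two alleles the sketch gives the three genotypes listed above; the count grows quickly once a locus has many alleles.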

If the locus is represented by the same allele in both homologues, the individual is said to be homozygous for this allele, or at this locus. Saying that it is homozygous at a locus emphasizes that the two homologues do not differ there; saying that it is homozygous for an allele emphasizes which allele is present. If the locus is represented by different alleles in the two homologues, the individual is heterozygous at this locus. For simplicity, homozygous and heterozygous individuals are called a homozygote and a heterozygote, respectively. Given what was said above about the identity and differences of alleles, true homozygotes are not very common in nature. In a particular experiment, however, nothing prevents us from disregarding differences that are not detected, or cannot be detected, in that experiment, and from treating as homozygotes those individuals in which both copies of the locus have an identical phenotypic manifestation. In studies involving related individuals, there are known homozygotes: individuals in which both alleles of a locus are identical by descent. Such studies often use the concept of average heterozygosity: the proportion of heterozygous loci among all loci.
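As a rough illustration of average heterozygosity for a single individual, here is a minimal Python sketch. The locus names and genotypes are invented for the example, and the function names `zygosity` and `average_heterozygosity` are hypothetical; alleles are treated as identical simply when their symbols match.

```python
def zygosity(genotype):
    """Classify a diploid genotype, given as a pair of allele symbols."""
    first, second = genotype
    return "homozygote" if first == second else "heterozygote"


def average_heterozygosity(genotypes_by_locus):
    """Proportion of heterozygous loci among all loci of one individual."""
    if not genotypes_by_locus:
        raise ValueError("at least one locus is required")
    heterozygous = sum(
        1 for genotype in genotypes_by_locus.values()
        if zygosity(genotype) == "heterozygote"
    )
    return heterozygous / len(genotypes_by_locus)


# Invented example: four loci, two of them heterozygous.
individual = {
    "locus1": ("A", "a"),
    "locus2": ("b", "b"),
    "locus3": ("C", "c"),
    "locus4": ("D", "D"),
}
print(average_heterozygosity(individual))
```

In practice the value depends on how finely alleles are distinguished: differences that an experiment cannot detect would be counted here as homozygosity.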

Let us add one more term, hemizygote: an individual in which not two alleles but only one is present. For example, you probably know that men have only one X chromosome, and that the second sex chromosome, the Y chromosome, is not homologous to it (except for small regions), since it lacks most of the regions rich in genetic information. Therefore, alleles from those regions of the X chromosome that are not represented on the Y chromosome have no homologues in the nucleus, that is, they are in the hemizygous state. Sometimes a chromosome loses a fragment together with the genes (or the single gene) located in it. In this case, the alleles of these genes on the homologous chromosome are also hemizygous. However, in a genetic experiment we often do not know what has happened in the chromosomes and judge genes only by their phenotype. In that case, the absence of a gene may be indistinguishable from its "breakdown", the loss of its function. As long as we do not know the molecular background but somehow conclude that the molecular function is lost, we will speak only of a null allele.

Distinguishing between homozygote, heterozygote and hemizygote matters in diploid organisms because the dose of the corresponding allele in the genome differs twofold between these cases (for example, for a locus on the X chromosome, two copies per genome in women and one in men), which may be important. Molecular genetics usually abstracts from the homozygosity or heterozygosity of its objects. However, it often uses the concept of gene dose, that is, the number of alleles with intact molecular function in the genome. Usually it varies from 0 to 2, but it can be increased by genetic modification, that is, by artificially introducing additional copies into the genome.
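The idea of gene dose can be sketched in a few lines of Python. The function name `gene_dose` and the choice of which allele symbols count as functional are assumptions made for this illustration only.

```python
def gene_dose(alleles_present, functional_alleles):
    """Count the alleles with intact molecular function in one genome."""
    return sum(1 for allele in alleles_present if allele in functional_alleles)


functional = {"w+"}  # assumption: only the wild-type allele is functional

# X-linked locus: two copies in a female, one copy (hemizygote) in a male.
female_dose = gene_dose(("w+", "w+"), functional)
male_dose = gene_dose(("w+",), functional)
mutant_dose = gene_dose(("w", "w"), functional)
print(female_dose, male_dose, mutant_dose)
```

A genetically modified genome with extra copies would simply be passed as a longer tuple of alleles, giving a dose above 2.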

For haploid organisms, it is customary to say that all alleles of all their genes are in the hemizygous state. Which organisms are haploid? Prokaryotes, lower fungi and ascomycetes, and the gametophytes of plants. Note one detail: haploids are not organisms that have strictly one haploid genome per cell. Most bacterial cells contain several nucleoids that have not yet separated, but they are all identical (up to de novo mutations). In lower fungi, the hyphae are often not divided into individual cells at all. What matters is that a haploid organism has only one variant of the haploid genome in its cells. Finally, some animals, such as the Hymenoptera, have a haploid sex: you probably know that bee drones are haploid. In their somatic cells the chromosome set doubles, but this does not stop them from being haploids. Mitochondria and plastids are often inherited only from the mother; therefore, cells are hemizygous for the genes found in the genomes of these organelles. However, in a number of plants plastids show biparental inheritance, in others this happens only occasionally, and paternal mitochondria enter the zygote extremely rarely. In such cases, the offspring receives from the two parents varying proportions of these organelles, not necessarily equal to 1/2, and it is customary to speak of heteroplasmy.

05.05.2015 13.10.2015

Modern genetics makes wide use of the terms allele, locus and marker. Yet a child's fate often depends on understanding such specialized terms, because paternity testing rests directly on these concepts.

Human genetic trait

Every person has a unique set of genes received from the parents. The combination of the parental genes produces a completely new, unique organism, the child, with its own set of genes.
For diagnostic purposes, researchers have identified certain regions of the human genome that show the greatest variability: loci (also called DNA markers).
Each of these loci has many genetic variants, alleles (allelic variants), and the combination of alleles carried is individual for each person. For example, a hair color locus might have two possible alleles, dark or light. Each marker has its own number of alleles: some markers have 7-8, others more than 20. The combination of alleles at all the loci studied is called a person's DNA profile.
It is the variability of these regions that makes it possible to test genetically for kinship between people, because at each locus a child receives one allele from each parent.

The principle of genetic testing

The genetic procedure for establishing biological paternity helps determine whether the man who considers himself a child's parent is the biological father or whether this is excluded. To examine biological paternity, the analysis compares the loci of the parents and their child.
Modern methods of DNA analysis can examine the human genome at several loci simultaneously. For example, a standardized genetic test includes the examination of 16 markers at once, while modern laboratories today carry out expert studies on almost 40 loci.
The analyses are carried out on modern genetic analyzers (sequencers). As output, the researcher receives an electropherogram that shows the loci and alleles of the analyzed sample. Thus, DNA analysis determines which alleles are present in the analyzed DNA sample.

Determining the likelihood of relationship

To determine the degree of kinship, the DNA profiles obtained for the participants in the examination are processed statistically, and on the basis of the results the expert draws a conclusion about the percentage probability of kinship.
To calculate the degree of kinship, a statistical program checks for the presence of the same allelic variants at all the loci studied. The calculation is carried out for all participants in the analysis. Its result is the combined paternity index. The second indicator is the probability of paternity. A high value of each of these indicators is evidence of the biological paternity of the man examined. As a rule, a database of allele frequencies obtained for the population of Russia is used to calculate the kinship indicators.
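How the two indicators relate can be shown with a minimal Python sketch. It relies on the conventional definitions, which the text does not spell out: the combined paternity index is taken as the product of the per-locus paternity indices, and the probability of paternity is derived from it with a prior probability of 0.5. The per-locus values below are invented for illustration; in a real examination they come from allele frequencies in a population database.

```python
import math


def combined_paternity_index(per_locus_indices):
    """Multiply the paternity indices of the individual loci."""
    return math.prod(per_locus_indices)


def probability_of_paternity(cpi, prior=0.5):
    """Convert a combined paternity index into a probability.

    With the conventional prior of 0.5 this reduces to cpi / (cpi + 1).
    """
    return (cpi * prior) / (cpi * prior + (1 - prior))


# Invented per-locus paternity indices for three loci.
example_indices = [2.0, 5.0, 10.0]
cpi = combined_paternity_index(example_indices)
probability = probability_of_paternity(cpi)
print(cpi, round(probability * 100, 2))
```

Because the indices are multiplied, every additional informative locus raises the combined index, which is why testing more loci pushes the probability closer to, but never exactly to, 100%.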
A match at all 16 independently selected DNA markers makes it possible to calculate the probability of paternity statistically. However, if the alleles at 3 or more of the 16 markers do not match, the result of the examination of biological paternity is considered negative.
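The exclusion rule can be sketched in Python as follows. This is a simplified illustration, not a laboratory procedure: a locus is counted as a mismatch when the child shares no allele with the alleged father, the mother's profile is ignored, and the locus names and allele values are invented. The function names are hypothetical.

```python
def count_mismatches(child, alleged_father):
    """Count loci at which the child shares no allele with the alleged father."""
    mismatches = 0
    for locus, child_alleles in child.items():
        father_alleles = alleged_father[locus]
        if set(child_alleles).isdisjoint(father_alleles):
            mismatches += 1
    return mismatches


def paternity_excluded(child, alleged_father, threshold=3):
    """Apply the rule from the text: 3 or more mismatching markers exclude paternity."""
    return count_mismatches(child, alleged_father) >= threshold


# Invented DNA profiles: each locus maps to the pair of alleles detected.
child_profile = {
    "locus1": (12, 14), "locus2": (8, 8), "locus3": (15, 17), "locus4": (9, 11),
}
man_profile = {
    "locus1": (14, 16), "locus2": (8, 10), "locus3": (13, 14), "locus4": (11, 11),
}
print(count_mismatches(child_profile, man_profile))
print(paternity_excluded(child_profile, man_profile))
```

Tolerating one or two mismatches reflects the fact that a single discrepancy can arise from a mutation rather than from non-paternity.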

Accuracy of examination results

Several factors affect the accuracy of the results of a genetic examination:
the number of genetic loci analyzed;
the nature of the loci.
Analyzing as many highly variable loci as possible makes it possible to establish (or, conversely, refute) paternity with a higher degree of probability.
Thus, with simultaneous analysis of up to 40 different loci, the probability reaches up to 99.9% when paternity is confirmed, and up to 100% when the result is negative.
Biological paternity cannot be established with 100% probability because of the theoretical possibility that another man has the same set of DNA markers as the child's father. However, at a probability of 99.9% the examination is considered positive and paternity proven.

What DNA sources are suitable for analysis?

DNA examination is a highly sensitive procedure that does not require a large sample for DNA extraction. Thanks to modern scientific advances, a genetic examination to determine the probability of paternity can be carried out both on biological material obtained directly from a person (a mouth swab, hair, blood) and on non-biological material, that is, objects that have merely been in contact with the person (for example, a toothbrush, an article of clothing, a baby pacifier, kitchen utensils). This is possible because the DNA molecules are the same in all of a person's cells, regardless of their origin, so a DNA sample obtained from a mouth swab can be compared with a sample from blood or with DNA obtained from a toothbrush or clothing.

New advances in determining paternity

The development of microchip diagnostics has become a new word in paternity determination. Because almost all human genes can be represented on a microchip (a small plate), determining paternity will not be difficult. This technology is like a genetic passport. After taking a sample of blood or of the amniotic fluid of the fetus, it will be possible to isolate DNA from it easily and hybridize it to the parents' microchips. Researchers also plan to use this technology to identify hereditary diseases.


Gene - the structural and functional unit of heredity in living organisms. A gene is a segment of DNA that specifies the sequence of a particular polypeptide or functional RNA.

Peptides - a family of substances whose molecules are built from two or more amino acid residues joined in a chain by peptide (amide) bonds -C(O)NH-. Usually peptides consisting of α-amino acids are meant. Peptides shorter than about 10-20 amino acid residues may also be called oligopeptides; longer ones are called polypeptides.

Proteins - a term usually applied to polypeptides containing about 50 or more amino acid residues.

Genome - the totality of hereditary material contained in a cell of an organism. The genome contains the biological information needed to build and maintain the organism. Most genomes, including the human genome and the genomes of all other cellular life forms, are built from DNA, but some viruses have RNA genomes. In humans (Homo sapiens), the genome consists of 23 pairs of chromosomes located in the nucleus, as well as mitochondrial DNA. The twenty-two autosomes, the two sex chromosomes X and Y, and the human mitochondrial DNA together contain approximately 3.1 billion base pairs.

Together with environmental factors, the genome determines the phenotype of the organism.

Genotype - the set of genes of a given organism, which characterizes the individual. The term "genotype", along with the terms "gene" and "phenotype", was introduced by the geneticist W. L. Johannsen in 1909 in his work "Elements of the Exact Theory of Heredity". Usually, genotype is spoken of in the context of a particular gene; in polyploid individuals, it denotes the combination of alleles of a given gene. Most genes are manifested in the phenotype of the organism, but phenotype and genotype differ in the following respects:

  1. By the source of information (the genotype is determined by studying an individual's DNA; the phenotype is recorded by observing the organism's appearance).
  2. The same genotype does not always correspond to the same phenotype. Some genes are manifested in the phenotype only under certain conditions. On the other hand, some phenotypes, for example the coat color of animals, result from the complementary interaction of several genes.

Alleles - different forms of the same gene, located at the same regions (loci) of homologous chromosomes and determining alternative variants of the development of the same trait. In a diploid organism there can be two identical alleles of a gene, in which case the organism is called homozygous, or two different ones, in which case it is heterozygous. The term "allele" was also proposed by W. Johannsen (1909).

Locus - in genetics, the position of a particular gene on the genetic or cytological map of a chromosome. A variant of the DNA sequence at a given locus is called an allele. An ordered list of the loci of a genome is called a genetic map.

Gene mapping is the determination of the locus for a specific biological trait.

Chromosomes - nucleoprotein structures in the nucleus of a eukaryotic cell, in which most of the hereditary information is concentrated and which serve for its storage, realization and transmission. Chromosomes are clearly distinguishable under a light microscope only during mitotic or meiotic cell division. The set of all chromosomes of a cell, called the karyotype, is a species-specific feature characterized by a relatively low level of individual variability.

Initially, the term was proposed for structures found in eukaryotic cells, but in recent decades bacterial and viral chromosomes have been spoken of more and more often. A broader definition therefore treats a chromosome as a structure that contains nucleic acid and whose function is to store, realize and transmit hereditary information. Eukaryotic chromosomes are DNA-containing structures in the nucleus, mitochondria and plastids. Prokaryotic chromosomes are DNA-containing structures in a cell without a nucleus.

A viral chromosome is a DNA or RNA molecule within a capsid.

Locus (from Latin locus - place)

Chromosomal locus: a linear region of a chromosome occupied by one gene. Genetic and cytological methods make it possible to determine the localization of a gene, that is, to establish on which chromosome a given gene lies, as well as the position of its locus relative to the loci of other genes on the same chromosome (see Genetic maps of chromosomes). As has been shown in some microorganisms, genes that control a particular sequence of biochemical reactions lie in neighboring loci, and the loci are arranged in the same order in which the biosynthesis reactions take place; this rule has not been established for higher organisms. The term "locus" is sometimes used in the genetic literature as a synonym for the terms gene and cistron.


Great Soviet Encyclopedia. Moscow: Soviet Encyclopedia, 1969-1978.

