ABSTRACT
Developing therapeutics for infectious diseases requires understanding the main
processes through which host-pathogen molecular interactions influence cellular
functions. The outcome of infectious diseases, including influenza A virus
(IAV) infection, depends greatly on how the host responds to the virus and how
the virus manipulates the host, both of which are mediated by protein-protein
functional interactions. Analyzing infection-associated genes at the systems
level may enable us to characterize the specific molecular mechanisms that
allow influenza A strains H1N1 and H3N2 to persist and survive inside the host.
A systems-level analysis based on experimental and computational approaches was
used to predict human protein-protein functional interactions. The resulting
human protein-protein functional interaction network is a graph whose nodes are
proteins and whose links join functionally interacting proteins. Using this
graph, we analyse the topological properties of the network, identify candidate
proteins using centrality measures, and map a set of IAV infection-associated
proteins onto it to elucidate genes related to IAV infection and to identify
essential dense sub-graphs underlying IAV infection outcome. We performed
functional closeness and enrichment analyses to identify statistically and
biologically significant processes and pathways implicated in IAV infection.
These IAV infection-associated proteins are shown to be relevant for further
research towards new drug and vaccine development. This study enhances our
understanding of the interplay between influenza A and its host and may
contribute to the design of novel drugs.
CHAPTER ONE
INTRODUCTION
This thesis presents a systems-level analysis for identifying potential targets
for influenza A. In this chapter, the background of the study is presented,
providing an overview of the whole work. We formulate the problem and discuss
the approaches that will be used to tackle it, highlighting their advantages.
In summary, this chapter provides a global view of the other chapters.
Background to the Study
Influenza A is a viral disease that can be found in humans and other mammals.
According to Smith et al. (2009), a new influenza A virus which originated from
swine surfaced in Mexico and the United States in March and early April 2009.
Smith et al. (2009) show that this virus was derived from several viruses
circulating in swine, which led to the first transmission to humans. The virus
had the potential to become the first influenza pandemic of the 21st century
(Smith et al., 2009). During the outbreak, the first week of surveillance
revealed the spread of the virus in over 30 countries through human-to-human
transmission. This led the World Health Organization (WHO) to raise the
pandemic alert to level 5 of 6. The different pandemic levels, as described by
the WHO in 2009, are shown in Table 1.
Smith et al. (2009) concluded that there was a need for systematic surveillance
of swine influenza and provided evidence that the mixing of new genetic
elements in swine can lead to the emergence of viruses with pandemic potential
in humans. In medicine, the detection, treatment and prevention of many
diseases have improved tremendously. This is, in part, due to an improved
understanding of biological systems and of the different factors that trigger
progression to disease.
This improvement has been driven by advances in high-throughput biological
experiments able to generate genome-scale datasets of biological cells,
including protein sequences, protein-protein interactions, gene expression,
regulation and other functional datasets. These datasets have enabled a
paradigm shift from single-gene analysis to systems-level analysis, providing a
global view of a system's behavior. This shift requires systematic mathematical
models to deal with the large volume of data and support effective biological
knowledge discovery. Systems-level analysis based on protein-protein
interactions has been vital to understanding how proteins function within the
cell (Deng and Sheng, 2007). Understanding protein interactions in a given
cellular proteome, sometimes known as the interactome, is important to the
analysis of cell biochemistry (Deng and Sheng, 2007). A comprehensive
collection of information on human proteins, their features, and their
functions is required for information retrieval and biological knowledge
discovery. For effective biological knowledge discovery, there is a need to
better understand the functional activities of proteins in cells, their exact
sub-cellular localization and their tissue-specific distribution. In addition,
knowledge of how the proteins encoded by disease-associated genes play their
roles in molecular complexes and biological pathways is very important (Deng
and Sheng, 2007). Some facts about swine-origin influenza A documented by the
Centers for Disease Control and Prevention (CDC, 2009) are as follows:
Swine influenza (swine flu) is a respiratory disease of pigs caused by type A
influenza viruses that regularly cause outbreaks of influenza in pigs. Swine
influenza viruses may circulate among swine throughout the year, but most
outbreaks occur during the late fall and winter months, similar to outbreaks in
humans.
The classical influenza type A H1N1 virus was first isolated
from a pig in 1930. Over the years, different variations of swine flu viruses
have emerged. At this time, there are four main Influenza A virus subtypes that
have been isolated in pigs: H1N1, H1N2, H3N2, and H3N1. However, most of the
recently isolated influenza viruses from pigs have been H1N1 viruses.
Swine influenza viruses do not normally infect humans. However, sporadic human
infections with swine flu have occurred. Most commonly, these cases occur in
persons with direct exposure to pigs, for example, children near pigs at a fair
or workers in the swine industry. In the past, CDC received reports of
approximately one human swine influenza virus infection every one to two years
in the U.S., but from December 2005 through February 2009, 12 cases of human
infection with swine influenza were reported. The symptoms of swine flu in
people are expected to be similar to the symptoms of regular human seasonal
influenza and include fever, lethargy, lack of appetite and coughing. Some
people with swine flu have also reported runny nose, sore throat, nausea,
vomiting and diarrhoea.
Influenza viruses can be directly transmitted from pigs to
humans and from humans to pigs. Human infection with flu viruses from pigs is
most likely to occur when people are in close proximity to infected pigs, such
as in pig barns and livestock exhibits housing pigs at fairs. Human-to-human
transmission of swine flu can also occur. This is thought to occur in the same
way as seasonal flu occurs in humans, which is mainly person-to-person
transmission through coughing or sneezing of people infected with the influenza
virus. Humans may become infected by touching something with flu viruses on it
and then touching their mouth or nose. The H1N1 swine flu viruses are
antigenically very different from human H1N1 viruses and, therefore, vaccines
for human seasonal flu would not provide protection from H1N1 swine flu
viruses.
In this project, we use systems-level computational approaches to identify
potential targets of influenza A H1N1 and H3N2. We use a graph-based model to
elucidate relationships between the different targets identified. Moreover, we
perform biological process and pathway enrichment analyses. These produce
sub-networks enriched in processes and pathways that may play an initial role
in influenza pathogenesis. In this study, datasets are drawn from different
sources to build the protein-protein functional network and perform further
analyses. One of the methods used to generate the different scores is
information theory, a branch of applied mathematics, electrical engineering and
computer science that involves the quantification of information. According to
Rieke and Warland (1997), information theory was developed by Claude E.
Shannon.
In information theory, a key measure is entropy, which quantifies the
uncertainty involved in predicting the value of a random variable. This measure
is used at one point in the scoring of sequence data. Information theory is
based on probability theory and statistics. According to Reza (1961), entropy
is an important quantity of information, and related measures are commonly
defined between two random variables. A property of entropy is that it is
maximized by the uniform distribution. The entropy H of a random variable Y
intuitively measures the amount of uncertainty about Y when only the
distribution of Y is known (Reza, 1961). Entropy is used in this research work
in the same spirit, to maximize information content.
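As an illustration of the entropy measure described above, the following is a minimal Python sketch, not the scoring procedure used in this thesis. It computes H(Y) = -sum of p(y) log2 p(y) in bits and checks that the uniform distribution gives the largest value. The function names and the idea of applying the measure to a single column of aligned sequence symbols are assumptions made only for the example.

```python
import math
from collections import Counter


def shannon_entropy(probabilities):
    """Return H = -sum(p * log2(p)) in bits; zero-probability outcomes are skipped."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def column_entropy(column):
    """Entropy of the empirical symbol distribution of one sequence column.

    Hypothetical helper: `column` is any non-empty string or list of symbols.
    """
    counts = Counter(column)
    total = len(column)
    return shannon_entropy(count / total for count in counts.values())


# Four equally likely outcomes versus a skewed distribution over the same outcomes.
uniform = shannon_entropy([0.25, 0.25, 0.25, 0.25])
skewed = shannon_entropy([0.7, 0.1, 0.1, 0.1])
print(uniform, skewed)  # the uniform case is the larger of the two
```

A fully conserved column such as "AAAA" has zero entropy, while a column in which four symbols are equally frequent reaches the maximum of 2 bits.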
Other methods used involve an application of network theory. In network science
and computer science, network theory is the study of graphs as a representation
of symmetric or asymmetric relationships between discrete objects. Network
theory is also part of graph theory (Newman, 2003a). It can be applied in many
different areas of study, such as statistical physics, computer science,
electrical engineering, operations research and gene regulatory networks. The
first true proof in network theory is Euler's solution of the Seven Bridges of
Königsberg problem (Newman, 2003a), which was to devise a walk through the city
that crosses each bridge exactly once, where the starting and ending points of
the walk need not be the same (Newman, 2003). There are different types of
networks that can be analysed. A social network, for instance, describes the
structure of relationships between social entities. Persons, groups,
organizations, nation states and websites can be considered as entities in this
case. Social network analysis has over the years played a major role in social
science. It has been used to analyse several phenomena, including the spread of
diseases, the study of markets and many others (Wasserman, 1994).
In biological networks, the analysis of molecular networks has become central
owing to the public availability of high-throughput biological data, especially
protein-protein interaction and other functional datasets. The type of analysis
here is almost the same as that of social network analysis, but it focuses on
the local patterns in the network. The analysis of biological networks in
relation to diseases led to the development of network medicine as another area
of application (Barabási and Gulbahce, 2011). Centrality measures, which are
widely used in network theory, are used in this study to analyze the
human-human protein network that is generated.
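To make the idea of centrality concrete, the sketch below computes two common measures, degree centrality and closeness centrality, on a small undirected toy graph. It is a simplified illustration under stated assumptions, not the analysis pipeline of this thesis: the protein names P1 to P5 and the edge list are hypothetical, the graph is assumed to be connected and unweighted, and the source does not say which centrality measures were actually used.

```python
from collections import deque

# Toy undirected network; protein names and links are hypothetical placeholders.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]


def build_adjacency(edge_list):
    """Map each node to the set of its neighbours."""
    adjacency = {}
    for u, v in edge_list:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree_centrality(adjacency):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adjacency)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def closeness_centrality(adjacency):
    """(Number of reachable nodes) / (sum of shortest-path lengths), via BFS.

    Assumes a connected, unweighted graph; isolated nodes get 0.0.
    """
    result = {}
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nbr in adjacency[current]:
                if nbr not in dist:
                    dist[nbr] = dist[current] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


adjacency = build_adjacency(edges)
degree = degree_centrality(adjacency)
closeness = closeness_centrality(adjacency)
top_candidate = max(closeness, key=closeness.get)
```

In this toy graph the hub P1 ranks highest under both measures, which is the sense in which centrality can flag candidate proteins in a larger network.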
Knowledge of clustering is also required in this research work. Cluster
analysis, or clustering, consists of grouping objects such that objects in the
same group (cluster) are more similar to each other than to objects in another
cluster (Bailey, 1994). Clustering is common in data mining, statistical data
analysis, bioinformatics, pattern recognition, image analysis and so on
(Bailey, 1994). There is no single clustering algorithm; different algorithms
differ both in their notion of what constitutes a cluster and in how they
efficiently find clusters. Examples are the agglomerative algorithm, which
merges similar nodes recursively, and the divisive algorithm, which detects
inter-community links and removes them from the network. These methods do not
produce a unique partitioning of the dataset; instead, they produce a hierarchy
from which the user still needs to choose appropriate clusters. They are not
very robust to outliers, which either show up as additional clusters or cause
other clusters to merge, and they are too slow for large datasets. There is,
however, another algorithm, introduced by Blondel and Guillaume (2008), which
we use in this work because it is fast and can produce results quickly for a
large network, unlike the agglomerative and divisive algorithms.
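Assuming the algorithm cited above is the modularity-optimization method commonly known as the Louvain method, the quantity it greedily increases is the modularity Q of a partition. The sketch below only evaluates Q for a given partition of a small unweighted, undirected graph without self-loops; it does not implement the algorithm itself. The node labels, edge list and community assignments are hypothetical.

```python
def modularity(edge_list, community_of):
    """Modularity Q = sum over communities c of L_c / m - (d_c / (2m))**2.

    m   : total number of edges
    L_c : number of edges with both endpoints in community c
    d_c : sum of the degrees of the nodes in community c
    Assumes an unweighted, undirected graph without self-loops.
    """
    m = len(edge_list)
    intra_edges = {}
    degree_sum = {}
    for u, v in edge_list:
        cu, cv = community_of[u], community_of[v]
        # Each edge adds one to the degree total of each endpoint's community.
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            intra_edges[cu] = intra_edges.get(cu, 0) + 1
    return sum(
        intra_edges.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Two triangles joined by a single bridge edge (hypothetical toy network).
toy_edges = [
    ("A", "B"), ("B", "C"), ("A", "C"),
    ("D", "E"), ("E", "F"), ("D", "F"),
    ("C", "D"),
]
split = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
single = {node: 0 for node in split}

q_split = modularity(toy_edges, split)
q_single = modularity(toy_edges, single)
```

Placing each triangle in its own community gives a higher Q than placing all nodes in one community, which is the kind of improvement a modularity-based method searches for when it moves nodes between communities.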