Makefiles

Makefiles

by example


Compiling your source code files can be tedious, specially when you want to include several source files and have to type the compiling command everytime you want to do it.
Well, I have news for you… Your days of command line compiling are (mostly) over, because YOU will learn how to write Makefiles.
Makefiles are special format files that together with the make utility will help you to automagically build and manage your projects.
For this session you will need these files:

I recommend creating a new directory and placing all the files in there.note: I use g++ for compiling. You are free to change it to a compiler of your choice

The make utility

If you run

make

this program will look for a file named makefile in your directory, and then execute it.
If you have several makefiles, then you can execute them with the command:

make -f MyMakefile

There are several other switches to the make utility. For more info, man make.

Build Process

  1. Compiler takes the source files and outputs object files
  2. Linker takes the object files and creates an executable

Compiling by hand

The trivial way to compile the files and obtain an executable, is by running the command:

g++ main.cpp hello.cpp factorial.cpp -o hello

The basic Makefile

The basic makefile is composed of:

 target: dependencies [tab] system command 

This syntax applied to our example would look like:

all:
	g++ main.cpp hello.cpp factorial.cpp -o hello

[Download here]To run this makefile on your files, type:

make -f Makefile-1

On this first example we see that our target is called all. This is the default target for makefiles. The make utility will execute this target if no other one is specified.
We also see that there are no dependencies for target all, so make safely executes the system commands specified.
Finally, make compiles the program according to the command line we gave it.

Using dependencies

Sometimes is useful to use different targets. This is because if you modify a single file in your project, you don’t have to recompile everything, only what you modified.
Here is an example:

all: hello

hello: main.o factorial.o hello.o
	g++ main.o factorial.o hello.o -o hello

main.o: main.cpp
	g++ -c main.cpp

factorial.o: factorial.cpp
	g++ -c factorial.cpp

hello.o: hello.cpp
	g++ -c hello.cpp

clean:
	rm -rf *o hello

[Download here]Now we see that the target all has only dependencies, but no system commands. In order for make to execute correctly, it has to meet all the dependencies of the called target (in this case all).
Each of the dependencies are searched through all the targets available and executed if found.
In this example we see a target called clean. It is useful to have such target if you want to have a fast way to get rid of all the object files and executables.

Using variables and comments

You can also use variables when writing Makefiles. It comes in handy in situations where you want to change the compiler, or the compiler options.

# I am a comment, and I want to say that the variable CC will be
# the compiler to use.
CC=g++
# Hey!, I am comment number 2. I want to say that CFLAGS will be the
# options I'll pass to the compiler.
CFLAGS=-c -Wall

all: hello

hello: main.o factorial.o hello.o
	$(CC) main.o factorial.o hello.o -o hello

main.o: main.cpp
	$(CC) $(CFLAGS) main.cpp

factorial.o: factorial.cpp
	$(CC) $(CFLAGS) factorial.cpp

hello.o: hello.cpp
	$(CC) $(CFLAGS) hello.cpp

clean:
	rm -rf *o hello

[Download here]As you can see, variables can be very useful sometimes. To use them, just assign a value to a variable before you start to write your targets. After that, you can just use them with the dereference operator $(VAR).

Where to go from here

With this brief introduction to Makefiles, you can create some very sophisticated mechanism for compiling your projects. However, this is just a tip of the iceberg. I don’t expect anyone to fully understand the example presented below without having consulted some Make documentation (which I had to do myself) or read pages 347 to 354 of your Unix book. 

CC=g++
CFLAGS=-c -Wall
LDFLAGS=
SOURCES=main.cpp hello.cpp factorial.cpp
OBJECTS=$(SOURCES:.cpp=.o)
EXECUTABLE=hello

all: $(SOURCES) $(EXECUTABLE)

$(EXECUTABLE): $(OBJECTS) 
	$(CC) $(LDFLAGS) $(OBJECTS) -o $@

.cpp.o:
	$(CC) $(CFLAGS) $< -o $@

[Download here]If you understand this last example, you could adapt it to your own personal projects changing only 2 lines, no matter how many additional files you have !!!.

 


Hector Urtubia

Posted in 网络转载 | Tagged , , | Leave a comment

Generating function

Generating function

From Wikipedia
This article is about generating functions in mathematics. For generating functions in classical mechanics, see Generating function (physics). For signalling molecule, see Epidermal growth factor.

In mathematics, a generating function is a formal power series in one indeterminate, whose coefficients encode information about a sequence of numbers an that is indexed by the natural numbers. Generating functions were first introduced by Abraham de Moivre in 1730, in order to solve the general linear recurrence problem.[1] One can generalize to formal power series in more than one indeterminate, to encode information about arrays of numbers indexed by several natural numbers.

There are various types of generating functions, including ordinary generating functionsexponential generating functionsLambert seriesBell series, and Dirichlet series; definitions and examples are given below. Every sequence in principle has a generating function of each type (except that Lambert and Dirichlet series require indices to start at 1 rather than 0), but the ease with which they can be handled may differ considerably. The particular generating function, if any, that is most useful in a given context will depend upon the nature of the sequence and the details of the problem being addressed.

Generating functions are often expressed in closed form (rather than as a series), by some expression involving operations defined for formal power series. These expressions in terms of the indeterminate x may involve arithmetic operations, differentiation with respect to x and composition with (i.e., substitution into) other generating functions; since these operations are also defined for functions, the result looks like a function of x. Indeed, the closed form expression can often be interpreted as a function that can be evaluated at (sufficiently small) concrete values of x, and which has the formal power series as its Taylor series; this explains the designation “generating functions”. However such interpretation is not required to be possible, because formal power series are not required to give a convergent series when a nonzero numeric value is substituted for x. Also, not all expressions that are meaningful as functions of x are meaningful as expressions designating formal power series; negative and fractional powers of x are examples of this.

Generating functions are not functions in the formal sense of a mapping from a domain to a codomain; the name is merely traditional, and they are sometimes more correctly called generating series.[2]

Contents

[hide]

[edit]Definitions

A generating function is a clothesline on which we hang up a sequence of numbers for display.
Herbert WilfGeneratingfunctionology (1994)

[edit]Ordinary generating function

The ordinary generating function of a sequence an is

G(a_n;x)=\sum_{n=0}^\infty a_nx^n.

When the term generating function is used without qualification, it is usually taken to mean an ordinary generating function.

If an is the probability mass function of a discrete random variable, then its ordinary generating function is called a probability-generating function.

The ordinary generating function can be generalized to arrays with multiple indices. For example, the ordinary generating function of a two-dimensional array am, n (where n and m are natural numbers) is

G(a_{m,n};x,y)=\sum_{m,n=0}^\infty a_{m,n}x^my^n.

[edit]Exponential generating function

The exponential generating function of a sequence an is

\operatorname{EG}(a_n;x)=\sum _{n=0}^{\infty} a_n \frac{x^n}{n!}.

[edit]Poisson generating function

The Poisson generating function of a sequence an is

\operatorname{PG}(a_n;x)=\sum _{n=0}^{\infty} a_n e^{-x} \frac{x^n}{n!} = e^{-x}\, \operatorname{EG}(a_n;x)\,.

[edit]Lambert series

The Lambert series of a sequence an is

\operatorname{LG}(a_n;x)=\sum _{n=1}^{\infty} a_n \frac{x^n}{1-x^n}.

Note that in a Lambert series the index n starts at 1, not at 0.

[edit]Bell series

The Bell series of a sequence an is an expression in terms of both an indeterminate x and a prime p and is given by[3]

\operatorname{BG}_p(a_n;x)=\sum_{n=0}^\infty a_{p^n}x^n.

[edit]Dirichlet series generating functions

Formal Dirichlet series are often classified as generating functions, although they are not strictly formal power series. The Dirichlet series generating function of a sequence an is[4]

\operatorname{DG}(a_n;s)=\sum _{n=1}^{\infty} \frac{a_n}{n^s}.

The Dirichlet series generating function is especially useful when an is a multiplicative function, when it has an Euler product expression[5] in terms of the function’s Bell series

\operatorname{DG}(a_n;s)=\prod_{p} \operatorname{BG}_p(a_n;p^{-s})\,.

If an is a Dirichlet character then its Dirichlet series generating function is called a Dirichlet L-series.

[edit]Polynomial sequence generating functions

The idea of generating functions can be extended to sequences of other objects. Thus, for example, polynomial sequences of binomial type are generated by

e^{xf(t)}=\sum_{n=0}^\infty {p_n(x) \over n!}t^n

where pn(x) is a sequence of polynomials and f(t) is a function of a certain form. Sheffer sequences are generated in a similar way. See the main article generalized Appell polynomials for more information.

[edit]Ordinary generating functions

Polynomials are a special case of ordinary generating functions, corresponding to finite sequences, or equivalently sequences that vanish after a certain point. These are important in that many finite sequences can usefully be interpreted as generating functions, such as the Poincaré polynomial, and others.

A key generating function is the constant sequence 1, 1, 1, 1, 1, 1, 1, 1, 1, …, whose ordinary generating function is

\sum_{n=0}^{\infty}x^n={1\over1-x}.

The left-hand side is the Maclaurin series expansion of the right-hand side. Alternatively, the right-hand side expression can be justified by multiplying the power series on the left by 1 − x, and checking that the result is the constant power series 1, in other words that all coefficients except the one of x0 vanish. Moreover there can be no other power series with this property. The left-hand side therefore designates the multiplicative inverse of 1 − x in the ring of power series.

Expressions for the ordinary generating function of other sequences are easily derived from this one. For instance, the substitution x → ax gives the generating function for the geometric sequence1,a,a2,a3,… for any constant a:

\sum_{n=0}^{\infty}(ax)^n={1\over1-ax}\,.

(The equality also follows directly from the fact that the left-hand side is the Maclaurin series expansion of the right-hand side.) In particular,

\sum_{n=0}^{\infty}(-1)^nx^n={1\over1+x}\,.

One can also introduce regular “gaps” in the sequence by replacing x by some power of x, so for instance for the sequence 1, 0, 1, 0, 1, 0, 1, 0, …. one gets the generating function

\sum_{n=0}^{\infty}x^{2n}={1\over1-x^2}.

By squaring the initial generating function, or by finding the derivative of both sides with respect to x and making a change of running variable n → n-1, one sees that the coefficients form the sequence 1, 2, 3, 4, 5, …, so one has

\sum_{n=0}^{\infty}(n+1)x^n={1\over(1-x)^2},

and the third power has as coefficients the triangular numbers 1, 3, 6, 10, 15, 21, … whose term n is the binomial coefficient \tbinom{n+2}2, so that

\sum_{n=0}^{\infty}\tbinom{n+2}2 x^n={1\over(1-x)^3}.

More generally, for any positive integer k, it is true that

\sum_{n=0}^{\infty}\tbinom{n+k}k x^n={1\over(1-x)^{k+1}}.

Note that, since

\binom{n+2}2=\frac12{(n+1)(n+2)} =\frac12(n^2+3n+2)\,,

one can find the ordinary generating function for the sequence 0, 1, 4, 9, 16, … of square numbers by linear combination of binomial-coefficient generating sequences;

G(n^2;x)=\sum_{n=0}^{\infty}n^2x^n={2\over(1-x)^3}-{3\over(1-x)^2}+{1\over1-x}=\frac{x(x+1)}{(1-x)^3}.

[edit]Rational functions

The ordinary generating function of a sequence can be expressed as a rational function (the ratio of two polynomials) if and only if the sequence is a linear recursive sequence; this generalizes the examples above.

[edit]Multiplication yields convolution

Main article: Cauchy product

Multiplication of ordinary generating functions yields a discrete convolution (the Cauchy product) of the sequences. For example, the sequence of cumulative sums (a_0, a_0 + a_1, a_0 + a_1 + a_2, \ldots) of a sequence with ordinary generating function G(anx) has the generating function G(a_n; x) \frac{1}{1-x} because 1/(1-x) is the ordinary generating function for the sequence (1, 1, …).

[edit]Relation to discrete-time Fourier transform

When the series converges absolutelyG\left(a_n; e^{-i \omega}\right) = \sum_{n=0}^\infty a_n e^{-i \omega n} is the discrete-time Fourier transform of the sequence a0, a1, ….

[edit]Asymptotic growth of a sequence

In calculus, often the growth rate of the coefficients of a power series can be used to deduce a radius of convergence for the power series. The reverse can also hold; often the radius of convergence for a generating function can be used to deduce the asymptotic growth of the underlying sequence.

For instance, if an ordinary generating function G(anx) that has a finite radius of convergence of r can be written as

G(a_n; x) = \frac{A(x) + B(x) (1- x/r)^{-\beta}}{x^{\alpha}} \,

where A(x) and B(x) are functions that are analytic to a radius of convergence greater than r (or are entire), and where B(r) ≠ 0 then

a_n \sim \frac{B(r)}{r^{\alpha} \Gamma(\beta)} \, n^{\beta-1}(1/r)^{n}\,,

using the Gamma function.

[edit]Asymptotic growth of the sequence of squares

As derived above, the ordinary generating function for the sequence of squares is \frac{x(x+1)}{(1-x)^3}\,. With r = 1, α = 0, β = 3, A(x) = 0, and B(x) = x(x+1), we can verify that the squares grow as expected, like the squares:

a_n \sim \frac{B(r)}{r^{\alpha} \Gamma(\beta)} \, n^{\beta-1}(1/r)^{n} = \frac{1(1+1)}{1^0\,\Gamma(3)}\,n^{3-1} (1/1)^n = n^2\,.

[edit]Asymptotic growth of the Catalan numbers

Main article: Catalan number

The ordinary generating function for the Catalan numbers is \frac{1-\sqrt{1-4x}}{2x}\,. With r = 1/4, α = 1, β = −1/2, A(x) = 1/2, and B(x) = −1/2, we can conclude that, for the Catalan numbers,

a_n \sim \frac{B(r)}{r^{\alpha} \Gamma(\beta)} \, n^{\beta-1}(1/r)^{n}<br />
= \frac{-1/2}{(1/4)^1 \Gamma(-1/2)} \, n^{-1/2-1} \left(\frac{1}{1/4}\right)^n<br />
= \frac{n^{-3/2} \, 4^n}{\sqrt{\pi}} \,.

[edit]Bivariate and multivariate generating functions

One can define generating functions in several variables for arrays with several indices. These are called multivariate generating functions or, sometimes, super generating functions. For two variables, these are often called bivariate generating functions.

For instance, since (1+x)^n is the ordinary generating function for binomial coefficients for a fixed n, one may ask for a bivariate generating function that generates the binomial coefficients \binom{n}{k} for all k and n. To do this, consider (1+x)^n as itself a series, in n, and find the generating function in y that has these as coefficients. Since the generating function for a^n is 1/(1-ay), the generating function for the binomial coefficients is:

\sum_{n,k} \binom n k x^k y^n = \frac{1}{1-(1+x)y}=\frac{1}{1-y-xy}\,.

[edit]Examples

Generating functions for the sequence of square numbers an = n2 are:

[edit]Ordinary generating function

G(n^2;x)=\sum_{n=0}^{\infty}n^2x^n=\frac{x(x+1)}{(1-x)^3}

[edit]Exponential generating function

\operatorname{EG}(n^2;x)=\sum _{n=0}^{\infty} \frac{n^2x^n}{n!}=x(x+1)e^x

[edit]Bell series

\operatorname{BG}_p(a_n;x)=\sum_{n=0}^\infty (p^{n})^2x^n=\frac{1}{1-p^2x}

[edit]Dirichlet series generating function

\operatorname{DG}(n^2;s)=\sum_{n=1}^{\infty} \frac{n^2}{n^s}=\zeta(s-2)\,,

using the Riemann zeta function.

The sequence a_n generated by a Dirichlet series generating function corresponding to:

\operatorname{DG}(a_n;s)=\zeta(s)^m

where \zeta(s) is the Riemann zeta function, has the ordinary generating function:

\begin{align}<br />
\sum \limits_{n=1}^{\infty} a_nx^n<br />
= x &+ {m \choose 1}\sum \limits_{a=2}^{\infty} x^{a}<br />
+ {m \choose 2}\sum \limits_{a=2}^{\infty} \sum \limits_{b=2}^{\infty} x^{ab} \\<br />
&+ {m \choose 3}\sum \limits_{a=2}^{\infty} \sum \limits_{b=2}^{\infty} \sum \limits_{c=2}^{\infty} x^{abc} + {m \choose 4}\sum \limits_{a=2}^{\infty} \sum \limits_{b=2}^{\infty} \sum \limits_{c=2}^{\infty} \sum \limits_{d=2}^{\infty} x^{abcd} +...<br />
\end{align}

[edit]Multivariate generating function

Multivariate generating functions arise in practice when calculating the number of contingency tables of non-negative integers with specified row and column totals. Suppose the table has r rows and c columns; the row sums are t_1,\ldots t_r and the column sums are s_1,\ldots s_c. Then, according to I. J. Good,[6] the number of such tables is the coefficient of x_1^{t_1}\ldots x_r^{t_r}y_1^{s_1}\ldots y_c^{s_c} in

<br />
\prod_{i=1}^{r}\prod_{j=1}^c\frac{1}{1-x_iy_j}.<br />

[edit]Applications

Generating functions are used to

  • Find a closed formula for a sequence given in a recurrence relation. For example consider Fibonacci numbers.
  • Find recurrence relations for sequences—the form of a generating function may suggest a recurrence formula.
  • Find relationships between sequences—if the generating functions of two sequences have a similar form, then the sequences themselves may be related.
  • Explore the asymptotic behaviour of sequences.
  • Prove identities involving sequences.
  • Solve enumeration problems in combinatorics and encoding their solutions. Rook polynomials are an example of an application in combinatorics.
  • Evaluate infinite sums.

[edit]Other generating functions

Examples of polynomial sequences generated by more complex generating functions include:

[edit]Similar concepts

Polynomial interpolation is finding a polynomial whose values (not coefficients) agree with a given sequence; the Hilbert polynomial is an abstract case of this in commutative algebra.

[edit]See also

[edit]Notes

  1. ^ Donald E. KnuthThe Art of Computer Programming, Volume 1 Fundamental Algorithms (Third Edition) Addison-Wesley. ISBN 0-201-89683-4. Section 1.2.9: Generating Functions, pp. 86
  2. ^ This alternative term can already be found in E.N. Gilbert, Enumeration of Labeled graphs, Canadian Journal of Mathematics 3, 1956, p. 405–411, but its use is rare before the year 2000; since then it appears to be increasing
  3. ^ Apostol (1976) pp.42–43
  4. ^ Wilf (1994) p.56
  5. ^ Wilf (1994) p.59
  6. ^ Good, I. J. (1986). “On applications of symmetric Dirichlet distributions and their mixtures to contingency tables”. The Annals of Statistics 4 (6): 1159–1189. doi:10.1214/aos/1176343649.

[edit]References

[edit]External links

Posted in 维基转载 | Tagged , | 1 Comment

Change Google Drive Default Folder Location in Windows

About: There are several ways to change or move Google Drive default folder location. On this topic, we will discuss the two easiest ways to do that.

Google Drive for Windows is a client program that will let you sync online and local files. It also gives you the opportunity to transfer and share files to other users without going through the online login. When Google Drive for Windows is installed on the computer, selected files will be accessible from anywhere.

After you install the client software, it will add Google Drive folder under your own user account, which is C:\Users\[your name]\Google Drive. Since C: Drive contains all your Windows programs and essential files, time will come that you may run out of space and will compel to move the default folder location of Google Drive.

That option is available, simply follow this procedure to change default folder location of Google Drive in Windows systems.

First Procedure – Disconnect Google Drive Account

1. Go to system tray and click on the Google Drive icon. Choose Preferences. It will open a window, click on  Disconnect Account. This will break your current connection so be sure that there is no existing file transfer before you execute this step.

Disconnect Google Drive account

2. Let us presume that all your Google Drive files and folders are updated. Please delete the existing Google Drive folder as well as shortcut under the Favorites. You can execute this through Windows Explorer.

3. Back again at the system tray, please click on Sign in. Enter your user name and password.

4. It will display the welcome screen as if your are installing the program for the first time. Click onNext button.

5. On the next screen, click on Advance Setup. Then, click on Change button and navigate to the new location of Google Drive folder. You can create a new folder by clicking on a Make New Folder button if you want the drive to be inside a folder. Choose a drive letter if you prefer Google Drive folder to be on the root.

Advanced Setup

Google Drive new folder

6. Be sure that the folder you are choosing is empty, otherwise it will display the message “The Google Drive folder you selected is not empty. Please select an empty folder”, you have no other choice but to delete all contents of the target folder.

7. Click Start Sync. That will set your new location as default Google Drive folder. You can now delete the old folder.

Second Procedure – Move Existing Google Drive Folder (without deleting the old)

1. Open Windows Explorer and go to your existing Google Drive folder. Create a folder “Empty”. Make sure that it sync with your account.

2. Next create a desired Folder for your new Google Drive location.

3. Go to Windows system tray and look for Google Drive icon. Go to Preferences and click Disconnect account…

4. Sign-in to Google Drive once more.

5. You will see a welcome screen (1 of 2), click Next.

6. On the next page (2 of 2), click on Advanced setup.

Advanced Setup

7. Go to Folder location area and click on Change… Browse for the new folder you just created to be the new default Google Drive.

8. On Sync Options, select “Only sync some folders to this computer”. Put a check mark on folderEmpty. This will allow Google Drive to sync for a minimal time before it sets the new location.

9. Click on Start sync. Google Drive will recognize the new location in less time because it syncs an empty folder only.

Sync Temp Folder Only

10. You may now move files from OLD Google Drive folder to a new location using Windows Explorer. Make sure that you will retain the directory structure. You can keep files from old folder or delete it to save space when syncing is complete.

11. To retain default value for Sync options, click on Google Drive icon on system tray. SelectPreferences.

12. Now, remove the check mark on “Only sync some folders to this computer”. Click on Apply changes to save the settings.

Posted in 网络转载 | Tagged , , | Leave a comment

de Moivre’s formula

de Moivre’s formula

From Wikipedia

In mathematicsde Moivre’s formula (a.k.a. De Moivre’s theorem and De Moivre’s identity), named after Abraham de Moivre, states that for any complex number (and, in particular, for anyreal numberx and integer n it holds that

(\cos x + i \sin x)^n = \cos (nx) + i \sin (nx).\,

While the formula was named after De Moivre, he never explicitly stated it in his works.[1]

The formula is important because it connects complex numbers (i stands for the imaginary unit (i2 = −1)) and trigonometry. The expression cos x + i sin x is sometimes abbreviated to cis x.

By expanding the left hand side and then comparing the real and imaginary parts under the assumption that x is real, it is possible to derive useful expressions for cos(nx) and sin(nx) in terms of cos x and sin x. Furthermore, one can use a generalization of this formula to find explicit expressions for the nth roots of unity, that is, complex numbers z such that zn = 1.

Contents

[hide]

[edit]Derivation

Although historically proven earlier, de Moivre’s formula can easily be derived from Euler’s formula

e^{ix} = \cos x + i\sin x\,

and the exponential law for integer powers

\left( e^{ix} \right)^n = e^{inx} .

Then, by Euler’s formula,

e^{i(nx)} = \cos (nx) + i\sin (nx).\,

[edit]Failure for non-integer powers

De Moivre’s formula does not, in general, hold for non-integer powers. Non-integer powers of a complex number can have many different values, see failure of power and logarithm identities. However there is a generalization that the right-hand side expression is one possible value of the power.

The derivation of de Moivre’s formula above involves a complex number to the power n. When the power is not an integer, the result is multiple-valued, for example, when n = ½ then:

For x = 0 the formula gives 1½ = 1
For x = 2π the formula gives 1½ = −1.

Since the angles 0 and 2π are the same this would give two different values for the same expression. The values 1 and −1 are however both square roots of 1 as the generalization asserts.

No such problem occurs with Euler’s formula since there is no identification of different values of its exponent. Euler’s formula involves a complex power of a positive real number and this always has a defined value. The corresponding expressions are:

e^{i0}=1
e^{i \pi}= -1.

[edit]Proof by induction (for integer n)

The truth of de Moivre’s theorem can be established by mathematical induction for natural numbers, and extended to all integers from there. Consider S(n):

(\cos x + i \sin x)^n = \cos (nx) + i \sin (nx), n \in \mathbb{Z}.

For n > 0, we proceed by mathematical inductionS(1) is clearly true. For our hypothesis, we assume S(k) is true for some natural k. That is, we assume

\left(\cos x + i \sin x\right)^k = \cos\left(kx\right) + i \sin\left(kx\right). \,

Now, considering S(k+1):

<br /><br />
\begin{alignat}{2}<br /><br />
    \left(\cos x+i\sin x\right)^{k+1} & = \left(\cos x+i\sin x\right)^{k} \left(\cos x+i\sin x\right)\\<br /><br />
                                      & = \left[\cos\left(kx\right) + i\sin\left(kx\right)\right] \left(\cos x+i\sin x\right) &&\qquad \text{by the induction hypothesis}\\<br /><br />
                                      & = \cos \left(kx\right) \cos x - \sin \left(kx\right) \sin x + i \left[\cos \left(kx\right) \sin x + \sin \left(kx\right) \cos x\right]\\<br /><br />
                                      & = \cos \left[ \left(k+1\right) x \right] + i\sin \left[ \left(k+1\right) x \right] &&\qquad \text{by the trigonometric identities}<br /><br />
\end{alignat}<br /><br />

We deduce that S(k) implies S(k+1). By the principle of mathematical induction it follows that the result is true for all natural numbers. Now, S(0) is clearly true since cos (0x) + i sin(0x) = 1 +i 0 = 1. Finally, for the negative integer cases, we consider an exponent of -n for natural n.

<br /><br />
\begin{align}<br /><br />
     \left(\cos x + i\sin x\right)^{-n} & = \left[ \left(\cos x + i\sin x\right)^n \right]^{-1} \\<br /><br />
                                       & = \left[\cos (nx) + i\sin (nx)\right]^{-1} \\<br /><br />
                                       & = \cos(-nx) + i\sin (-nx). \qquad (*) \\<br /><br />
\end{align}<br /><br />

The equation (*) is a result of the identity z^{-1} = \frac{\bar{z}}{|z|^2}, for z = cos nx + i sin nx. Hence, S(n) holds for all integers n.

[edit]Formulas for cosine and sine individually

Being an equality of complex numbers, one necessarily has equality both of the real parts and of the imaginary parts of both members of the equation. If x, and therefore also cos x and sin x, arereal numbers, then the identity of these parts can be written using binomial coefficients. This formula was given by 16th century French mathematician Franciscus Vieta:

\sin nx = \sum_{k=0}^n \binom{n}{k} \cos^kx\,\sin^{n-k}x\,\sin\left(\frac{1}{2}(n-k)\pi\right)
\cos nx = \sum_{k=0}^n \binom{n}{k} \cos^kx\,\sin^{n-k}x\,\cos\left(\frac{1}{2}(n-k)\pi\right).

In each of these two equations, the final trigonometric function equals one or minus one or zero, thus removing half the entries in each of the sums. These equations are in fact even valid for complex values of x, because both sides are entire (that is, holomorphic on the whole complex plane) functions of x, and two such functions that coincide on the real axis necessarily coincide everywhere. Here are the concrete instances of these equations for n = 2 and n = 3:

\begin{alignat}2<br /><br />
  \cos(2x) &= (\cos{x})^2 +((\cos{x})^2-1) &&= 2(\cos{x})^2-1\\<br /><br />
  \sin(2x) &= 2(\sin{x})(\cos{x})\\<br /><br />
  \cos(3x) &= (\cos{x})^3 +3\cos{x}((\cos{x})^2-1) &&= 4(\cos{x})^3-3\cos{x}\\<br /><br />
  \sin(3x) &= 3(\cos{x})^2(\sin{x})-(\sin{x})^3 &&= 3\sin{x}-4(\sin{x})^3.\\<br /><br />
\end{alignat}

The right hand side of the formula for cos(nx) is in fact the value Tn(cos x) of the Chebyshev polynomial Tn at cos x.

[edit]Generalization

The formula is actually true in a more general setting than stated above: if z and w are complex numbers, then

\left(\cos z + i\sin z\right)^w

is a multi-valued function while

\cos (wz) + i \sin (wz)\,

is not. However, it still holds that

\cos (wz) + i \sin (wz) \text{ is one value of } \left(\cos z + i\sin z\right)^w.\,

[edit]Applications

This formula can be used to find the nth roots of a complex number. This application does not strictly use de Moivre’s formula as the power is not an integer.[citation needed] However considering the right hand side to the power of n will, in each case, give the same value on the left-hand side.

If z is a complex number, written in polar form as

z=r\left(\cos x+i\sin x\right),\,

then

<br /><br />
z^{1/n} = \left[ r\left( \cos x+i\sin x \right) \right]^{1/n} = r^{1/n} \left[ \cos \left( \frac{x+2k\pi}{n} \right) + i\sin \left( \frac{x+2k\pi}{n} \right) \right]<br /><br />

where k is an integer. To get the n different roots of z one only needs to consider values of k from 0 to n − 1.

[edit]Analog for hyperbolic trigonometry

Since \cosh x + \sinh x = e^x, an analog to De Moivre’s formula also applies to the hyperbolic trigonometry. For all n \in \mathbb{Z}(\cosh x + \sinh x)^n = \cosh nx + \sinh nx. Also, ifn \in \mathbb{Q}, then one value of (\cosh x + \sinh x)^n will be \cosh nx + \sinh nx.[2]

[edit]Analog for quaternions

To find the roots of a quaternion there is an analogous form of De Moivre’s formula. A quaternion in the form d + a\mathbf{\hat{i}} + b\mathbf{\hat{j}} + c\mathbf{\hat{k}} can be represented in the form q = k(\cos \theta + \epsilon \sin \theta) for0 \leq \theta < 2 \pi. In this representation, k = \sqrt{d^2 + a^2 + b^2 + c^2}, and the trigonometric functions are defiend as \cos \theta = \frac{d}{k} and \sin \theta = \pm \frac{\sqrt{a^2 + b^2 + c^2}}{k}. In the case that a^2 + b^2 + c^2 \neq 0\epsilon = \pm \frac{a\mathbf{\hat{i}} + b\mathbf{\hat{j}} + c\mathbf{\hat{k}}}{\sqrt{a^2 + b^2 + c^2}} (the unit vector). This leads to the variation of De Moivre’s formula:

q^n = k^n(\cos n \theta + \epsilon \sin n \theta)[3]

‘Example’ To find the cubic roots of Q = 1 + \mathbf{\hat{i}} +  \mathbf{\hat{j}}+ \mathbf{\hat{k}}, write the quaternion in the form

Q = 2(\cos 60^\circ + \epsilon \sin 60^\circ) \qquad \mathrm{where} \; \epsilon = \frac{\mathbf{\hat{i}} +  \mathbf{\hat{j}}+ \mathbf{\hat{k}}}{\sqrt{3}}

Then the cubic roots are given by q:

q = \sqrt[3]{2}(\cos \theta + \epsilon \sin \theta) \qquad  \theta = 20^\circ, 140^\circ, 260^\circ

[edit]References

  1. ^ Daniels, Margaret L. Lial, John Hornsby, David I. Schneider ;with assistance of Callie J. (2008). College algebra and trigonometry (4th ed. ed.). Boston: Pearson/Addison Wesley. p. 792.ISBN 9780321497444.
  2. ^ Mukhopadhyay, Utpal (NaN undefined NaN). “Some interesting features of hyperbolic functions”. Resonance 11 (8): 81–85. doi:10.1007/BF02855783.
  3. ^ Brand, Louis (October 1942). “The roots of a quaternion”The American Mathematical Monthly 49 (8): 519-520. Retrieved 19 August 2012.

[edit]External links

Posted in 维基转载 | Tagged | Leave a comment

Reduction (complexity)

Reduction (complexity)

From Wikipedia, the free encyclopedia

Example of a reduction from a boolean satisfiability problem to a vertex cover problem. Blue vertices form a vertex cover which corresponds to truth values.

In computability theory and computational complexity theory, a reduction is a transformation of one problem into another problem. Depending on the transformation used this can be used to define complexity classes on a set of problems.

Intuitively, problem A is reducible to problem B if an algorithm for solving problem B efficiently (if it existed) could also be used as a subroutine to solve problem A efficiently. When this is true, solving A cannot be harder than solving B. We write A ≤m B, usually with a subscript on the ≤ to indicate the type of reduction being used (m : mapping reduction, p : polynomial reduction).

Contents

[hide]

[edit]Introduction

Often we find ourselves trying to solve a problem that is similar to a problem we’ve already solved. In these cases, often a quick way of solving the new problem is to transform each instance of the new problem into instances of the old problem, solve these using our existing solution, and then use these to obtain our final solution. This is perhaps the most obvious use of reductions.

Another, more subtle use is this: suppose we have a problem that we’ve proven is hard to solve, and we have a similar new problem. We might suspect that it, too, is hard to solve. We argue by contradiction: suppose the new problem is easy to solve. Then, if we can show that every instance of the old problem can be solved easily by transforming it into instances of the new problem and solving those, we have a contradiction. This establishes that the new problem is also hard.

A very simple example of a reduction is from multiplication to squaring. Suppose all we know how to do is to add, subtract, take squares, and divide by two. We can use this knowledge, combined with the following formula, to obtain the product of any two numbers:

a \times b = \frac{\left(\left(a + b\right)^{2} - a^{2} - b^{2}\right)}{2}

We also have a reduction in the other direction; obviously, if we can multiply two numbers, we can square a number. This seems to imply that these two problems are equally hard. This kind of reduction corresponds to Turing reduction.

However, the reduction becomes much harder if we add the restriction that we can only use the squaring function one time, and only at the end. In this case, even if we’re allowed to use all the basic arithmetic operations, including multiplication, no reduction exists in general, because we may have to compute an irrational number like \sqrt{2} from rational numbers. Going in the other direction, however, we can certainly square a number with just one multiplication, only at the end. Using this limited form of reduction, we have shown the unsurprising result that multiplication is harder in general than squaring. This corresponds to many-one reduction.

[edit]Definition

Given two subsets A and B of N and a set of functions F from N to N which is closed under compositionA is called reducible to B under F if

\exists f \in F \mbox{ . } \forall x \in \mathbb{N} \mbox{ . } x \in A \Leftrightarrow f(x) \in B

We write

A \leq_{F} B

Let S be a subset of P(N) and ≤ a reduction, then S is called closed under ≤ if

\forall s \in S \mbox{ . } \forall A \in P(\mathbb{N}) \mbox{ . } A \leq s \Rightarrow A \in S

A subset A of N is called hard for S if

\forall s \in S \mbox{ . } s \leq A

A subset A of N is called complete for S if A is hard for S and A is in S.

[edit]Properties

A reduction is a preordering, that is a reflexive and transitive relation, on P(NP(N), where P(N) is the power set of the natural numbers.

[edit]Types and applications of reductions

As described in the example above, there are two main types of reductions used in computational complexity, the many-one reduction and the Turing reduction. Many-one reductions mapinstances of one problem to instances of another; Turing reductions compute the solution to one problem, assuming the other problem is easy to solve. A many-one reduction is weaker than a Turing reduction. Weaker reductions are more effective at separating problems, but they have less power, making reductions harder to design.

A problem is complete for a complexity class if every problem in the class reduces to that problem, and it is also in the class itself. In this sense the problem represents the class, since any solution to it can, in combination with the reductions, be used to solve every problem in the class.

However, in order to be useful, reductions must be easy. For example, it’s quite possible to reduce a difficult-to-solve NP-complete problem like the boolean satisfiability problem to a trivial problem, like determining if a number equals zero, by having the reduction machine solve the problem in exponential time and output zero only if there is a solution. However, this does not achieve much, because even though we can solve the new problem, performing the reduction is just as hard as solving the old problem. Likewise, a reduction computing a noncomputable function can reduce an undecidable problem to a decidable one. As Michael Sipser points out in Introduction to the Theory of Computation: “The reduction must be easy, relative to the complexity of typical problems in the class [...] If the reduction itself were difficult to compute, an easy solution to the complete problem wouldn’t necessarily yield an easy solution to the problems reducing to it.”

Therefore, the appropriate notion of reduction depends on the complexity class being studied. When studying the complexity class NP and harder classes such as the polynomial hierarchy,polynomial-time reductions are used. When studying classes within P such as NC and NLlog-space reductions are used. Reductions are also used in computability theory to show whether problems are or are not solvable by machines at all; in this case, reductions are restricted only to computable functions.

In case of optimization (maximization or minimization) problems, we often think in terms of approximation-preserving reductions. Suppose we have two optimization problems such that instances of one problem can be mapped onto instances of the other, in a way that nearly optimal solutions to instances of the latter problem can be transformed back to yield nearly optimal solutions to the former. This way, if we have an optimization algorithm (or approximation algorithm) that finds near-optimal (or optimal) solutions to instances of problem B, and an efficient approximation-preserving reduction from problem A to problem B, by composition we obtain an optimization algorithm that yields near-optimal solutions to instances of problem A. Approximation-preserving reductions are often used to prove hardness of approximation results: if some optimization problem A is hard to approximate (under some complexity assumption) within a factor better than α for some α, and there is a β-approximation-preserving reduction from problem A to problem B, we can conclude that problem B is hard to approximate within factor α/β.

[edit]Examples

[edit]Detailed example

The following example shows how to use reduction from the halting problem to prove that a language is undecidable. Suppose H(Mw) is the problem of determining whether a given Turing machine M halts (by accepting or rejecting) on input string w. This language is known to be undecidable. Suppose E(M) is the problem of determining whether the language a given Turing machineM accepts is empty (in other words, whether M accepts any strings at all). We show that E is undecidable by a reduction from H.

To obtain a contradiction, suppose R is a decider for E. We will use this to produce a decider S for H (which we know does not exist). Given input M and w (a Turing machine and some input string), define S(Mw) with the following behavior: S creates a Turing machine N that accepts only if the input string to N is w and M halts on input w, and does not halt otherwise. The decider Scan now evaluate R(N) to check whether the language accepted by N is empty. If R accepts N, then the language accepted by N is empty, so in particular M does not halt on input w, so S can reject. If R rejects N, then the language accepted by N is nonempty, so M does halt on input w, so S can accept. Thus, if we had a decider R for E, we would be able to produce a decider S for the halting problem H(Mw) for any machine M and input w. Since we know that such an S cannot exist, it follows that the language E is also undecidable.

Posted in 维基转载 | Tagged , | Leave a comment

Maximum flow problem

Maximum flow problem

From Wikipedia, the free encyclopedia

An example of a flow network with a maximum flow. The source is s, and the sink t. The numbers denote flow and capacity.

In optimization theory, maximum flow problems involve finding a feasible flow through a single-source, single-sink flow network that is maximum.

The maximum flow problem can be seen as a special case of more complex network flow problems, such as the circulation problem. The maximum value of an s-t flow is equal to the minimum capacity of an s-t cut in the network, as stated in the max-flow min-cut theorem.

Contents

  [hide

[edit]History

The maximum flow problem was first formulated in 1954 by T. E. Harris as a simplified model of Soviet railway traffic flow.[1] In 1955, Lester R. Ford, Jr. and Delbert R. Fulkerson created the first known algorithm, the Ford–Fulkerson algorithm.[2][3]

Over the years, various improved solutions to the maximum flow problem were discovered, notably the shortest augmenting path algorithm of Edmonds and Karp and independently Dinitz; the blocking flow algorithm of Dinitz; the push-relabel algorithm of Goldberg and Tarjan; and the binary blocking flow algorithm of Goldberg and Rao. The electrical flow algorithm of Christiano, Kelner, Madry, and Spielman finds an approximately optimal maximum flow but only works in undirected graphs.

[edit]Definition

A flow network, with source s and sink t. The numbers next to the edge are the capacities.

Let \scriptstyle N = (V, E) be a network with \scriptstyle s, t \in V being the source and the sink of \scriptstyle N respectively.

The capacity of an edge is a mapping \scriptstyle c : E \to \mathbb{R}^+, denoted by \scriptstyle c_{uv} or \scriptstyle c(u, v). It represents the maximum amount of flow that can pass through an edge.
A flow is a mapping \scriptstyle f : E \to \mathbb{R}^+, denoted by \scriptstyle f_{uv} or \scriptstyle f(u, v), subject to the following two constraints:
1. \scriptstyle f_{uv} \leq c_{uv}, for each \scriptstyle (u, v) \in E (capacity constraint: the flow of an edge cannot exceed its capacity)
2. \scriptstyle \sum_{u:(u, v) \in E} f_{uv} = \sum_{u:(v, u) \in E} f_{vu}, for each \scriptstyle v \in V \setminus \{s, t\} (conservation of flows: the sum of the flows entering a node must equal the sum of the flows exiting a node, except for the source and the sink nodes)
The value of flow is defined by \scriptstyle |f| = \sum_{v \in V} f_{sv}, where \scriptstyle s is the source of \scriptstyle N. It represents the amount of flow passing from the source to the sink.

The maximum flow problem is to maximize \scriptstyle |f|, that is, to route as much flow as possible from \scriptstyle s to \scriptstyle t.

[edit]Solutions

We can define the Residual Graph, which provides a systematic way to search for forward-backward operations in order to find the maximum flow.

Given a flow network \scriptstyle G, and a flow \scriptstyle f on \scriptstyle G, we define the residual graph \scriptstyle G_{f} of \scriptstyle G with respect to \scriptstyle f as follows.

1. The node set of \scriptstyle G_{f} is the same as that of \scriptstyle G.

2. Each edge \scriptstyle e = (u, v) of \scriptstyle G_{f} is with a capacity of \scriptstyle c_{e} - f(e).

3. Each edge \scriptstyle e' = (v, u) of \scriptstyle G_{f} is with a capacity of \scriptstyle f(e).

The following table lists algorithms for solving the maximum flow problem.

Method
Complexity
Description

Linear programming
Constraints given by the definition of a legal flow. See the linear program here.

Ford–Fulkerson algorithm
O(E max| f |)
As long as there is an open path through the residual graph, send the minimum of the residual capacities on the path.

The algorithm works only if all weights are integers. Otherwise it is possible that the Ford–Fulkerson algorithm will not converge to the maximum value.

Edmonds–Karp algorithm
O(VE2)
A specialization of Ford–Fulkerson, finding augmenting paths with breadth-first search.

Dinitz blocking flow algorithm
O(V2E)
In each phase the algorithms builds a layered graph with breadth-first search on the residual graph. The maximum flow in a layered graph can be calculated in O(VE) time, and the maximum number of the phases is n-1. In networks with unit capacities, Dinic’s algorithm terminates in \scriptscriptstyle  O(E\sqrt{V}) time.

General push-relabel maximum flow algorithm
O(V2E)
The push relabel algorithm maintains a preflow, i.e. a flow function with the possibility of excess in the vertices. The algorithm runs while there is a vertex with positive excess, i.e. an active vertex in the graph. The push operation increases the flow on a residual edge, and a height function on the vertices controls which residual edges can be pushed. The height function is changed with a relabel operation. The proper definitions of these operations guarantee that the resulting flow function is a maximum flow.

Push-relabel algorithm withFIFO vertex selection rule
O(V3)
Push-relabel algorithm variant which always selects the most recently active vertex, and performs push operations until the excess is positive or there are admissible residual edges from this vertex.

Dinitz blocking flow algorithm with dynamic trees
O(VE log(V))
The dynamic trees data structure speeds up the maximum flow computation in the layered graph to O(Elog(V)).

Push-relabel algorithm with dynamic trees
O(VE log(V2/E))
The algorithm builds limited size trees on the residual graph regarding to height function. These trees provide multilevel push operations.

Binary blocking flow algorithm [1]
\scriptscriptstyle O(E\min(V^{2/3},\sqrt{E})\log(V^2/E)\log{U})
The value U corresponds to the maximum capacity of the network.

MPM (Malhotra, Pramodh-Kumar and Maheshwari) algorithm
O(V3)

For a more extensive list, see [2].

[edit]Integral flow theorem

The integral flow theorem states that

If each edge in a flow network has integral capacity, then there exists an integral maximal flow.

[edit]Application

[edit]Multi-source multi-sink maximum flow problem

Fig. 4.1.1. Transformation of a multi-source multi-sink maximum flow problem into a single-source single-sink maximum flow problem

Given a network N = (V,E) with a set of sources S = {s1, …, sn} and a set of sinks T = {t1, …, tm} instead of only one source and one sink, we are to find the maximum flow across N. We can transform the multi-source multi-sink problem into a maximum flow problem by adding a consolidated source connecting to each vertex in S and a consolidated sink connected by each vertex in T (also known as supersource and supersink) with infinite capacity on each edge (See Fig. 4.1.1.).

[edit]Minimum path cover in directed acyclic graph

Given a directed acyclic graph G = (V, E), we are to find the minimum number of paths to cover each vertex in V. We can construct a bipartite graph G‘ = (VoutVin, E‘ ) from G, where

  1. Vout = {vV: v has positive out-degree}.
  2. Vin = {vV: v has positive in-degree}.
  3. E‘ = {(u,v)∈(Vout,Vin): (u,v)∈E}.

Then it can be shown, via König’s theorem, that G‘ has a matching of size m if and only if there exists n-m paths that cover each vertex in G, where n is the number of vertices in G. Therefore, the problem can be solved by finding the maximum cardinality matching in G‘ instead.

[edit]Maximum cardinality bipartite matching

Fig. 4.3.1. Transformation of a maximum bipartite matching problem into a maximum flow problem

Given a bipartite graph G = (XY, E), we are to find a maximum cardinality matching in G, that is a matching that contains the largest possible number of edges. This problem can be transformed into a maximum flow problem by constructing a network N = (XY∪{s,t}, E‘ }, where

  1. E’ contains the edges in G directed from X to Y.
  2. (s,x)∈E‘ for each xX and (y,t)∈E‘ for each yY.
  3. c(e) = 1 for each eE’ (See Fig. 4.3.1).

Then the value of the maximum flow in N is equal to the size of the maximum matching in G.

[edit]Maximum flow problem with vertex capacities

Fig. 4.4.1. Transformation of a maximum flow problem with vertex capacities constraint into the original maximum flow problem by node splitting

Given a network N = (V, E), in which there is capacity at each node in addition to edge capacity, that is, a mapping c: VR+, denoted by c(v), such that the flow f has to satisfy not only the capacity constraint and the conservation of flows, but also the vertex capacity constraint

Σi∈Vfivc(v) for each vV∖{s,t}.

In other words, the amount of flow passing through a vertex cannot exceed its capacity.

To find the maximum flow across N, we can transform the problem into the maximum flow problem in the original sense by expanding N. First, each vV is replaced by vin and vout, where vin is connected by edges going into v and vout is connected to edges coming out from v, then assign capacity c(v) to the edge connecting vin and vout (See Fig. 4.4.1). In this expanded network, the vertex capacity constraint is removed and therefore the problem can be treated as the original maximum flow problem.

[edit]Maximum independent path

Given a directed graph G = (V, E) and two vertices s and t, we are to find the maximum number of independent paths from s to t. Two paths are said to be independent if they do not have a vertex in common apart from s and t. We can construct a network N = (V, E) from G with vertex capacities, where

  1. s and t are the source and the sink of N respectively.
  2. c(v) = 1 for each vV.
  3. c(e) = 1 for each eE.

Then the value of the maximum flow is equal to the maximum number of independent paths from s to t.

[edit]Maximum edge-disjoint path

Given a directed graph G = (V, E) and two vertices s and t, we are to find the maximum number of edge-disjoint paths from s to t. This problem can be transformed to a maximum flow problem by constructing a network N = (V, E) from G with s and t being the source and the sink of Nrespectively and assign each edge with unit capacity.

[edit]See also

[edit]References

  1. ^ Schrijver, Alexander, "On the history of the transportation and maximum flow problems", Mathematical Programming 91 (2002) 437-445
  2. ^ Ford, L.R., Jr.; Fulkerson, D.R., "Maximal Flow through a Network", Canadian Journal of Mathematics (1956), pp.399-404.
  3. ^ Ford, L.R., Jr.; Fulkerson, D.R., Flows in Networks, Princeton University Press (1962).

[edit]Notes

  1. ^ Andrew V. Goldberg and S. Rao (1998). "Beyond the flow decomposition barrier". J. Assoc. Comput. Mach. 45 (5): 753–782. doi:10.1145/290179.290181.
  2. ^ Andrew V. Goldberg and Robert E. Tarjan (1988). "A new approach to the maximum-flow problem". Journal of the ACM (ACM Press) 35 (4): 921–940. doi:10.1145/48014.61051. ISSN 0004-5411.
  3. ^ Joseph Cheriyan and Kurt Mehlhorn (1999). "An analysis of the highest-level selection rule in the preflow-push max-flow algorithm". Information Processing Letters 69 (5): 239–242. doi:10.1016/S0020-0190(99)00019-8.
  4. ^ Daniel D. Sleator and Robert E. Tarjan (1983). "A data structure for dynamic trees". Journal of Computer and System Sciences 26 (3): 362–391. doi:10.1016/0022-0000(83)90006-5. ISSN 0022-0000.
  5. ^ Daniel D. Sleator and Robert E. Tarjan (1985). "Self-adjusting binary search trees". Journal of the ACM (ACM Press) 32 (3): 652–686. doi:10.1145/3828.3835. ISSN 0004-5411.
  6. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein (2001). "26. Maximum Flow". Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill. pp. 643–668. ISBN 0-262-03293-7.
  7. Eugene Lawler (2001). "4. Network Flows". Combinatorial Optimization: Networks and Matroids. Dover. pp. 109–177. ISBN 0-486-41453-1.
Posted in 维基转载 | Tagged , , , , | Leave a comment

ower Summations

ower Summations
(Math | Calculus | Series | Power)
Summation Expansion Equivalent Value Comments
  n
sum k
k=1
= 1 + 2 + 3 + 4 + .. + n = (n2 + n) / 2
= (1/2)n2 + (1/2)n
sum of 1stn integers
  n
sum k 2
k=1
= 1 + 4 + 9 + 16 + .. + n2 = (1/6)n(n+1)(2n+1)
= (1/3)n3 + (1/2)n2 + (1/6)n
sum of 1stn squares
  n
sum k 3
k=1
= 1 + 8 + 27 + 64 + .. + n3 = (1/4)n4 + (1/2)n3 + (1/4)n2 sum of 1stn cubes
  n
sum k 4
k=1
= 1 + 16 + 81 + 256 + .. + n4 = (1/5)n5 + (1/2)n4 + (1/3)n3 - (1/30)n
  n
sum k 5
k=1
= 1 + 32 + 243 + 1024 + .. + n5 = (1/6)n6 + (1/2)n5 + (5/12)n4 - (1/12)n2
  n
sum k 6
k=1
= 1 + 64 + 729 + 4096 + .. + n6 = (1/7)n7 + (1/2)n6 + (1/2)n5 - (1/6)n3 + (1/42)n
  n
sum k 7
k=1
= 1 + 128 + 2187 + 16384 + .. + n7 = (1/8)n8 + (1/2)n7 + (7/12)n6 - (7/24)n4 + (1/12)n2
  n
sum k 8
k=1
= 1 + 256 + 6561 + 65536 + .. + n8 = (1/9)n9 + (1/2)n8 + (2/3)n7 - (7/15)n5 + (2/9)n3 - (1/30)n
  n
sum k 9
k=1
= 1 + 512 + 19683 + 262144 + .. + n9 = (1/10)n10 + (1/2)n9 + (3/4)n8 - (7/10)n6 + (1/2)n4 - (3/20)n2
  n
sum k 10
k=1
= 1 + 1024 + 59049 + 1048576 + .. + n10 = (1/11)n11 + (1/2)n10 + (5/6)n9 - n7 + n5 - (1/2)n3 + (5/66)n
Posted in 网络转载 | Tagged , | 1 Comment

Integrate x/(x^2+1)?

  1. Home >
  2. All Categories >
  3. Science & Mathematics >
  4. Mathematics >
  5. Resolved Question

Resolved Question

Show me another»

Integrate x/(x^2+1)?

the answer should have ln in it. please show some steps that i can understand?

Best Answer - Chosen by Voters

integral x/(x^2+1) dxuse substitution method, so u = x^2 + 1, which means dx = du / 2x

you substitute and get integral of x/u * du / 2x

x’s cancel and you get integral 1/2u * du

move 1/2 constant out and get 1/2 integral 1/u * du

antiderivative of 1/x is lnx, so 1/2 * lnu

answer is (1/2)(ln u)

but still have to plug u back in,

final answer is (1/2)(ln (x^2 + 1))

100% 1 Vote
  • 6 people rated this as good

Not the right answer? Try Yahoo! Search


 

Other Answers (4)

  • u-sub the bottom… leave an x with the du so that they can cancel the x in the numerator… then you have 1/u… and the integral of that is ln(u)
    0% 0 Votes
  • integrate the numerator; so it equals d(1/2x^2). put out 1/2 to be a constanta, and become 1/2d(x^2). we may change it to be 1/2d(x^2+1), because it could be. so the result is ln 1/2(x^2+1)
    0% 0 Votes
  • Substitute: x^2+1
    u = x^2+1
    du = 2xdx.
    xdx = du/2
    1/2integral du/u = 0.5ln IuI = 0.5ln (x^2+1)
    0% 0 Votes
    • 1 person rated this as good
  • Integrate x/(x^2+1) dxYou can get past formal substitution possibly if you can recognize that the numerator needs only a factor of 2 to be the exact differential of the denominator. We can insert that factor, as long as we also provide a factor of 1/2 i.e.

    (½) ⌠ [2xdx] / [x²+ 1] = (½) ln [x²+ 1] + c

    0% 0 Votes
    • 2 people rated this as good

Answers International

Yahoo! does not evaluate or guarantee the accuracy of any Yahoo! Answers content. Click here for the Full Disclaimer.

Help us improve Yahoo! Answers. Tell us what you think.

Copyright © 2012 Yahoo! Asia Pacific Pte Ltd. (Co. Reg. No. 199700735D). All Rights Reserved.

Posted in 网络转载 | Tagged , , | Leave a comment

Closest pair of points problem

Closest pair of points problem

From Wikipedia, the free encyclopedia

  (Redirected from Closest pair of points)

Closest pair of points shown in red

The closest pair of points problem or closest pair problem is a problem of computational geometry: given n points in metric space, find a pair of points with the smallest distance between them. The closest pair problem for points in the Euclidean plane[1] was among the first geometric problems which were treated at the origins of the systematic study of the computational complexity of geometric algorithms.

A naive algorithm of finding distances between all pairs of points and selecting the minimum requires O(dn2) time. It turns out that the problem may be solved in O(n log n) time in a Euclidean space or Lp space of fixed dimension d. In the algebraic decision tree model of computation, the O(n log n) algorithm is optimal. The optimality follows from the observation that the element uniqueness problem (with the lower bound of Ω(n log n) for time complexity) is reducible to the closest pair problem: checking whether the minimal distance is 0 after the solving of the closest pair problem answers the question whether there are two coinciding points.

In the computational model which assumes that the floor function is computable in constant time the problem can be solved in O(n log log n) time.[2] If we allow randomization to be used together with the floor function, the problem can be solved inO(n) time.[3][4]

Contents

  [hide

[edit]Brute-force algorithm

The closest pair of points can be computed in O(n2) time by performing a brute-force search. To do that, one could compute the distances between all the n(n − 1) / 2 pairs of points, then pick the pair with the smallest distance, as illustrated below.

minDist = infinity
for i = 1 to length(P)
 for j = i + 1 to length(P)
  let p = P[i], q = P[j]
  if dist(p, q) < minDist:
   minDist = dist(p, q)
   closestPair = (p, q)
return closestPair

[edit]Planar case

The problem can be solved in O(n log n) time using the recursive divide and conquer approach, e.g., as follows:[1]

  1. Sort points according to their x-coordinates.
  2. Split the set of points into two equal-sized subsets by a vertical line x=xmid.
  3. Solve the problem recursively in the left and right subsets. This yields the left-side and right-side minimum distances dLmin and dRmin, respectively.
  4. Find the minimal distance dLRmin among the pair of points in which one point lies on the left of the dividing vertical and the second point lies to the right.
  5. The final answer is the minimum among dLmin, dRmin, and dLRmin.

Divide-and-conquer: sparse box observation

It turns out that step 4 may be accomplished in linear time. Again, a naive approach would require the calculation of distances for all left-right pairs, i.e., in quadratic time. The key observation is based on the following sparsity property of the point set. We already know that the closest pair of points is no further apart than dist= min(dLmin, dRmin). Therefore for each point p of the left of the dividing line we have to compare the distances to the points that lie in the rectangle of dimensions(dist, 2 ⋅ dist) to the right of the dividing line, as shown in the figure. And what is more, this rectangle can contain at most six points with pairwise distances at most dRmin. Therefore it is sufficient to compute at most 6n left-right distances in step 4. The recurrence relation for the number of steps can be written as T(n) = 2 T(n/2) + O(n), which we can solve using the master theorem to get O(n log n).

As the closest pair of points define an edge in the Delaunay triangulation, and correspond to two adjacent cells in the Voronoi diagram, the closest pair of points can be determined in linear time when we are given one of these two structures. Computing either the Delaunay triangulation or the Voronoi diagram takes O(n log n) time. These approaches are not efficient for dimension d>2, while the divide-and-conquer algorithm can be generalized to take O(n log n) time for any constant value of d.

[edit]Dynamic closest-pair problem

The dynamic version for the closest-pair problem is stated as follows:

  • Given a dynamic set of objects, find algorithms and data structures for efficient recalculation of the closest pair of objects each time the objects are inserted or deleted.

If the bounding box for all points is known in advance and the constant-time floor function is available, then the expected O(n) space data structure was suggested that supports expected-time O(log n) insertions and deletions and constant query time. When modified for the algebraic decision tree model, insertions and deletions would require O(log2 n) expected time.[5] It is worth noting, though, that the complexity of the dynamic closest pair algorithm cited above is exponential in the dimension d, and therefore such an algorithm becomes less suitable for high-dimensional problems.

[edit]References

  1. ^ a b M. I. Shamos and D. Hoey. "Closest-point problems." In Proc. 16th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 151—162, 1975 (DOI 10.1109/SFCS.1975.8)
  2. ^ S. Fortune and J.E. Hopcroft. "A note on Rabin’s nearest-neighbor algorithm." Information Processing Letters, 8(1), pp. 20—23, 1979
  3. ^ S. Khuller and Y. Matias. A simple randomized sieve algorithm for the closest-pair problem. Inf. Comput., 118(1):34—37,1995
  4. ^ Richard Lipton (24 September 2011). "Rabin Flips a Coin".
  5. ^ Mordecai Golin, Rajeev Raman, Christian Schwarz, Michiel Smid, "Randomized Data Structures For The Dynamic Closest-Pair Problem", SIAM J. Comput., vo. 26, no. 4, 1998, preliminary version reported at the 4th Annu. ACM-SIAM Symp. on Discrete Algorithms, pp. 301–310 (1993)
Posted in 算法学习, 维基转载, 计算几何 | Tagged , , | Leave a comment

Animated film expresses childish fun

Animated film expresses childish fun  http://www.imdb.com/title/tt0837562/

Cast overview

Adam Sandler

Dracula (voice)

Andy Samberg

Jonathan (voice)

Selena Gomez

Mavis (voice)

Kevin James

Frankenstein (voice)

Fran Drescher

Eunice (voice)

Steve Buscemi

Wayne (voice)

Molly Shannon

Wanda (voice)

David Spade

Griffin (voice)

CeeLo Green

Murray (voice)

Jon Lovitz

Quasimodo (voice)

Brian George

Suit of Armor (voice)

Luenell

Shrunken Heads (voice)

Brian Stack

Pilot (voice)

 

An animated movie with Adam Sandler voicing Dracula sounds about as irritating as it gets for anyone beyond the age of eight or for early fans of the actor’s comedy career. Hotel Transylvania, though, manages to be a harmlessly humorous romp enhanced by director Genndy Tartakovsky’s effort to push the animation to a manic, bubblegum level.

In the film, Count Dracula, in an effort to protect his daughter Mavis (Selena Gomez) from the outside world, creates Hotel Transylvania so that monsters can relax in a sanctuary from the human world that they believe still fears and despises them. These creatures include Frankenstein’s Monster (Kevin James), the Wolfman, Wayne (Steve Buscemi) and the Mummy (CeeLo Green).

Many of these characters are back at the hotel for Mavis’ 118th birthday; the “young” vampiress yearns to explore the outside world, even though her father believes that human civilization would only destroy her. Yet in the aftermath of Dracula’s elaborate ruse, a human named Jonathan (Andy Samberg) ends up at the hotel, threatening the safety of the monsters and the reputation of the hotel. The plot, of course, thickens when Jonathan and Mavis begin to form a mutual attraction to one another.

If the film carries a glaring fault, it’s that this kind of father-daughter[c1]  relationship has been explored again and again in other films, particularly animated ones. And thus, the film becomes predictable; if you’ve seen The Little Mermaid or Finding Nemo (with a similarly protective father-son relationship), you’ll have a very good idea of where this story is going the moment Mavis and Jonathan lay eyes on each other.

But outside  the film’s hackneyed plot, Hotel Transylvania moves along briskly.

The voice actors do a fairly good job; Sandler isn’t nearly as annoying as you would expect, and there are moments where he gives the vampiric lord an air of grace and understanding. Still, most of the other voice actors have been featured in his other films, so Hotel Transylvania feels like the cast of a typical Adam Sandler comedy — just with legendary horror creatures. The familiar cast gives off a strange feeling of déjà vu[c2] , but it makes the friendship between the characters seem all the more legitimate.

The look and feel of the film boasts the style of the director, Genndy Tartakovsky. Tartakovsky, the creator of Dexter’s Laboratory and Samurai Jack, was the sixth director brought onto the project, which Sony Pictures Animation has struggled with since 2006. The studio allowed Tartakovsky to rewrite the script and push and exaggerate the 3-D animation. The film’s visual style now resembles classic Tex Avery cartoons rather than the realistic images of Pixar.

Though the style doesn’t differ much from the plastic look of Sony Animation’s previous film, Cloudy with a Chance of Meatballs, it definitely keeps the spirit of the simplified approach Tartakovsky is known for. The facial expressions are almost spot-on character takes from Dexter’s Laboratory, and considering Tartakovsky hasn’t done too many projects recently, it’s nice to see him back in the spotlight.

But when the film kicks into high gear, the elasticity and squash-and-stretch technique used for the characters really push the limits of animation techniques. Hotel Transylvania’s form of exaggeration and experimentation is refreshing to see in a major animated feature — especially since recent films have focused too much on how good the graphics look as opposed to what advancements they can contribute to the animation industry.

It must be said, however, that if you really aren’t familiar with Tartakovsky or don’t pay attention to animation styles, this film might be a bit too thin. Hotel Transylvania remains enjoyable on its own terms, but it caters more to children than adults who appreciate more sly, sophiscated comedy.

Hotel Transylvania’s hyperactive humor often covers very familiar ground.  The film itself makes commentary in the third act on how these horror legends have become so familiar to the public that the monsters resemble celebrities more than creatures worth fearing.  And Hotel Transylvania does manage to avoid relying too much on pop culture references, except for one blatant — yet still very funny — throwaway gag about the Twilightfranchise.

But animation buffs and the young-at-heart should definitely check out Hotel Transylvania. And if the film proves successful, it’ll be great to see what Tartakovsky will do next in animation.

Would a Samurai Jack movie be too much to ask for?

 


 [c1]Jnathan isn’t Mavis father, Dracula is Mavis father.

 [c2]

dé·jà vu

n.

1. Psychology The illusion of having already experienced something actually being experienced for the first time.

2.

a. An impression of having seen or experienced something before: Old-timers watched the stock-market crash with a distinct sense of déjà vu.

b. Dull familiarity; monotony: the déjà vu of the tabloid headlines.

 

Posted in 网络转载 | Tagged , | 1 Comment

Random variable

Random variable

From Wikipedia, the free encyclopedia

In probability and statistics, a random variable or stochastic variable is a variable whose value is subject to variations due to chance (i.e. randomness, in a mathematical sense). As opposed to other mathematical variables, a random variable conceptually does not have a single, fixed value (even if unknown); rather, it can take on a set of possible different values, each with an associated probability.

A random variable’s possible values might represent the possible outcomes of a yet-to-be-performed experiment or an event that has not happened yet, or the potential values of a past experiment or event whose already-existing value is uncertain (e.g. as a result of incomplete information or imprecise measurements). They may also conceptually represent either the results of an “objectively” random process (e.g. rolling a die), or the “subjective” randomness that results from incomplete knowledge of a quantity. The meaning of the probabilities assigned to the potential values of a random variable is not part of probability theory itself, but instead related to philosophical arguments over the interpretation of probability. The mathematics works the same regardless of the particular interpretation in use.

Random variables can be classified as either discrete (i.e. it may assume any of a specified list of exact values) or as continuous (i.e. it may assume any numerical value in an interval or collection of intervals). The mathematical function describing the possible values of a random variable and their associated probabilities is known as a probability distribution. The realizations of a random variable, i.e. the results of randomly choosing values according to the variable’s probability distribution are called random variates.

The basic concept of “random variable” in statistics is real-valued. However, one can consider arbitrary types such as boolean valuescategorical variablescomplex numbersvectorsmatrices,sequencestreessetsshapesmanifoldsfunctions, and processes. The term random element is used to encompass all such related concepts. An example is the stochastic process, a set of indexed random variables (typically indexed by time or space). These more general concepts are particularly useful in fields such as computer science and natural language processing where many of the basic elements of analysis are non-numerical. Such general random elements can sometimes be treated as sets of real-valued random variables — often more specifically as random vectors). For example:

  • A “random word” may be parameterized by an integer-valued index into the vocabulary of possible words; or alternatively as an indicator vector, in which exactly one element is a 1 and the others are 0, with the 1 indexing a particular word into a vocabulary.
  • A “random sentence” may be parameterized as a vector of random words.
  • random graph, for a graph with V edges, may be parameterized as an NxN matrix, indicating the weight for each edge, or 0 for no edge. (If the graph has no weights, 1 indicates an edge, 0 indicates no edge.)

However, reduction to numerical values is not essential for dealing with random elements: a randomly selected individual remains an individual, not a number.

The formal mathematical treatment of random variables is dealt with in the subject of probability theory. In that context, random variables are defined in terms of functions defined on a probability space.

Contents

[hide]

[edit]Introduction

Real-valued random variables (those whose range is the real numbers) are used in the sciences to make predictions based on data obtained from scientific experiments.[citation needed] In addition to scientific applications, random variables were developed for the analysis of games of chance and stochastic events. In such instances, the function that maps the outcome to a real number is often the identity function or similarly trivial function, and not explicitly described. In many cases, however, it is useful to consider random variables that are functions of other random variables, and then the mapping function included in the definition of a random variable becomes important. As an example, the square of a random variable distributed according to a standard normal distribution is itself a random variable, with a chi-squared distribution. One way to think of this is to imagine generating a large number of samples from a standard normal distribution, squaring each one, and plotting a histogram of the values observed. With enough samples, the graph of the histogram will approximate the density function of a chi-squared distribution with one degree of freedom.

Another example is the sample mean, which is the average of a number of samples. When these samples are independent observations of the same random event they can be called independent identically distributed random variables. Since each sample is a random variable, the sample mean is a function of random variables and hence a random variable itself, whose distribution can be computed and properties determined.

One of the reasons that real-valued random variables are so commonly considered is that the expected value (a type of average) and variance (a measure of the “spread”, or extent to which the values are dispersed) of the variable can be computed.[citation needed]

There are several types of random variables, the most common two are the discrete and the continuous.[1] A discrete random variable maps outcomes to values of a countable set (e.g., theintegers), with each value in the range having probability greater than or equal to zero. A continuous random variable maps outcomes to values of an uncountable set (e.g., the real numbers). For a continuous random variable, the probability of any specific value is zero, whereas the probability of some infinite set of values (such as an interval of non-zero length) may be positive. A random variable can be “mixed”, with part of its probability spread out over an interval like a typical continuous variable, and part of it concentrated on particular values like a discrete variable. These classifications are equivalent to the categorization of probability distributions.

The expected value of random vectorsrandom matrices, and similar aggregates of fixed structure is defined as the aggregation of the expected value computed over each individual element. The concept of “variance of a random vector” is normally expressed through a covariance matrix. No generally-agreed-upon definition of expected value or variance exists for cases other than just discussed.

[edit]Definition

A random variable is defined on a set of possible outcomes (the sample space Ω) and a probability distribution that associates each outcome with a probability. A random variable represents a measurable aspect or property of the outcomes, and hence associates each outcome with a number. In an experiment a person may be chosen at random, and one random variable may be its age, and another its number of children. Formally a random variable is considered to be a function on the possible outcomes. Random variables are typically classified as either discrete or continuous. Discrete variables can take on either a finite or at most a countably infinite set of discrete values. Their probability distribution is given by a probability mass function which directly maps a value of the random variable to a probability. Continuous variables, however, take on values that vary continuously within one or more (possibly infinite) intervals. As a result there are anuncountably infinite number of individual outcomes, and each has a probability 0. As a result, the probability distribution for many continuous random variables is defined using a probability density function, which indicates the “density” of probability in a small neighborhood around a given value. More technically, the probability that an outcome is in a particular range is derived from theintegration of the probability density function in that range. Both concepts can be united using a cumulative distribution function (CDF), which describes the probability that an outcome will be less than or equal to a specified value. This function is necessarily monotonically non-decreasing, with a minimum value of 0 at negative infinity and a maximum value of 1 at positive infinity. The CDF of a discrete distribution will consist mostly of flat areas along with sudden jumps at each outcome defined in the sample space, while the CDF of a continuous distribution will typically rise gradually and continuously. Distributions that are partly discrete and partly continuous can also be described this way.

[edit]Examples

In a poll an adult person \omega is chosen at random from the British population \Omega. The random variable A is the age of the chosen person, and the random variable C the number of children:

A:\Omega \to \N, \omega \mapsto A(\omega)=\text{age of }\omega,
C:\Omega \to \N, \omega \mapsto C(\omega)=\text{number of children of }\omega.

The possible outcomes for one coin toss can be described by the sample space \Omega = {heads, tails}. We can introduce a real-valued random variable Y that models a $1 payoff for a successful bet on heads as follows:

<br />
    Y(\omega) = \begin{cases}<br />
          1, & \text{if} \ \ \omega = \text{heads} ,\\<br />
          0, & \text{if} \ \ \omega = \text{tails} .<br />
        \end{cases}<br />

If the coin is equally likely to land on either side then it has a probability mass function given by:

\rho_Y(y) = \begin{cases}\frac{1}{2},& \text{if }y=1,\\<br />
\frac{1}{2},& \text{if }y=0.\end{cases}

A random variable can also be used to describe the process of rolling a die and the possible outcomes. The most obvious representation is to take the set \Omega = {1, 2, 3, 4, 5, 6} as the sample space, defining the random variable X to be equal to the number rolled. In this case,

X(\omega) = \omega

and

\rho_X(x) = \begin{cases}\frac{1}{6},& \text{if }x=1,2,3,4,5,6,\\</p>
<p>0,& \text{otherwise} .\end{cases}

An example of a continuous random variable would be one based on a spinner that can choose a horizontal direction. Then the values taken by the random variable are directions. We could represent these directions by North, West, East, South, Southeast, etc. However, it is commonly more convenient to map the sample space to a random variable which takes values which are real numbers. This can be done, for example, by mapping a direction to a bearing in degrees clockwise from North. The random variable then takes values which are real numbers from the interval [0, 360), with all parts of the range being "equally likely". In this case, X = the angle spun. Any real number has probability zero of being selected, but a positive probability can be assigned to anyrange of values. For example, the probability of choosing a number in [0, 180] is ½. Instead of speaking of a probability mass function, we say that the probability density of X is 1/360. The probability of a subset of [0, 360) can be calculated by multiplying the measure of the set by 1/360. In general, the probability of a set for a given continuous random variable can be calculated by integrating the density over the given set.

An example of a random variable of mixed type would be based on an experiment where a coin is flipped and the spinner is spun only if the result of the coin toss is heads. If the result is tails, X = −1; otherwise X = the value of the spinner as in the preceding example. There is a probability of ½ that this random variable will have the value −1. Other ranges of values would have half the probability of the last example.

[edit]Measure-theoretic definition

The most formal, axiomatic definition of random variables involves measure theory. Continuous random variables are defined in terms of sets of numbers, along with functions that map such sets to probabilities. Because of various difficulties (e.g. the Banach-Tarski paradox) that arise if such sets are insufficiently constrained, it is necessary to introduce what is termed a sigma-algebra to constrain the possible sets over which probabilities can be defined. Normally, a particular such sigma-algebra is used, the Borel σ-algebra, which allows for probabilities to be defined over any sets that can be derived either directly from continuous intervals of numbers or by a finite or countably infinite number of unions and/or intersections of such intervals.

The measure-theoretic definition is as follows.

Let (Ω, ℱ, P) be a probability space and (E, ℰ) a measurable space. Then an (E, ℰ)-valued random variable is a function X: Ω→E which is (ℱ, ℰ)-measurable. The latter means that, for every subset B ∈ ℰ, its preimage X −1(B) ∈ ℱ where X −1(B) = {ω: X(ω) ∈ B}.[2] This definition enables us to measure any subset B in the target space by looking at its preimage, which by assumption is measurable.

When E is a topological space, then the most common choice for the σ-algebra ℰ is to take it equal to the Borel σ-algebra ℬ(E), which is the σ-algebra generated by the collection of all open sets in E. In such case the (E, ℰ)-valued random variable is called the E-valued random variable. Moreover, when space E is the real line ℝ, then such real-valued random variable is called simply the random variable.

[edit]Real-valued random variables

In this case the observation space is the real numbers. Recall, (\Omega, \mathcal{F}, P) is the probability space. For real observation space, the function X\colon \Omega \rightarrow \mathbb{R} is a real-valued random variable if

\{ \omega : X(\omega) \le r \} \in \mathcal{F} \qquad \forall r \in \mathbb{R}

This definition is a special case of the above because the set \{(-\infty, r]: r \in \R\} generates the Borel sigma-algebra on the real numbers, and it suffices to check measurability on any generating set. Here we can prove measurability on this generating set by using the fact that \{ \omega : X(\omega) \le r \} = X^{-1}((-\infty, r]).

[edit]Distribution functions of random variables

If a random variable X\colon \Omega \to \mathbb{R} defined on the probability space (\Omega, \mathcal{F}, P) is given, we can ask questions like “How likely is it that the value of X is bigger than 2?”. This is the same as the probability of the event \{ \omega : X(\omega) > 2 \}\,\!  which is often written as P(X > 2)\,\! for short.

Recording all these probabilities of output ranges of a real-valued random variable X yields the probability distribution of X. The probability distribution “forgets” about the particular probability space used to define X and only records the probabilities of various values of X. Such a probability distribution can always be captured by its cumulative distribution function

F_X(x) = \operatorname{P}(X \le x)

and sometimes also using a probability density function. In measure-theoretic terms, we use the random variable X to “push-forward” the measure P on Ω to a measure dF on R. The underlying probability space Ω is a technical device used to guarantee the existence of random variables, sometimes to construct them, and to define notions such as correlation and dependence orindependence based on a joint distribution of two or more random variables on the same probability space. In practice, one often disposes of the space Ω altogether and just puts a measure on Rthat assigns measure 1 to the whole real line, i.e., one works with probability distributions instead of random variables.

[edit]Moments

The probability distribution of a random variable is often characterised by a small number of parameters, which also have a practical interpretation. For example, it is often enough to know what its “average value” is. This is captured by the mathematical concept of expected value of a random variable, denoted E[X], and also called the first moment. In general, E[f(X)] is not equal to f(E[X]). Once the “average value” is known, one could then ask how far from this average value the values of X typically are, a question that is answered by the variance and standard deviation of a random variable. E[X] can be viewed intuitively as an average obtained from an infinite population, the members of which are particular evaluations of X.

Mathematically, this is known as the (generalised) problem of moments: for a given class of random variables X, find a collection {fi} of functions such that the expectation values E[fi(X)] fully characterise the distribution of the random variable X.

Moments can only be defined for real-valued functions of random variables. If the random variable is itself real-valued, then moments of the variable itself can be taken, which are equivalent to moments of the identity function f(X)=X of the random variable. However, even for non-real-valued random variables, moments can be taken of real-valued functions of those variables. For example, for a categorical random variable X that can take on the nominal values “red”, “blue” or “green”, the real-valued function [X = \text{green}] can be constructed; this uses the Iverson bracket, and has the value 1 if X has the value “green”, 0 otherwise. Then, the expected value and other moments of this function can be determined.

[edit]Functions of random variables

A new random variable Y can be defined by applying a real Borel measurable function g\colon \mathbb{R} \rightarrow \mathbb{R} to the outcomes of a real-valued random variable X. The cumulative distribution function of Y\,\! is

F_Y(y) = \operatorname{P}(g(X) \le y).

If function g is invertible, i.e. g−1 exists, and increasing, then the previous relation can be extended to obtain

F_Y(y) = \operatorname{P}(g(X) \le y) = \operatorname{P}(X \le g^{-1}(y)) = F_X(g^{-1}(y))

and, again with the same hypotheses of invertibility of g, assuming also differentiability, we can find the relation between the probability density functions by differentiating both sides with respect to y, in order to obtain

f_Y(y) = f_X(g^{-1}(y)) \left| \frac{d g^{-1}(y)}{d y} \right| .

If there is no invertibility of g but each y admits at most a countable number of roots (i.e. a finite, or countably infinite, number of xi such that y = g(xi)) then the previous relation between theprobability density functions can be generalized with

f_Y(y) = \sum_{i} f_X(g_{i}^{-1}(y)) \left| \frac{d g_{i}^{-1}(y)}{d y} \right|

where xi = gi-1(y). The formulas for densities do not demand g to be increasing.

In the measure-theoretic, axiomatic approach to probability, if we have a random variable X\!  on \Omega \,\!  and a Borel measurable function g\colon \mathbb{R} \rightarrow \mathbb{R}, then Y = g(X)\,\!  will also be a random variable on \Omega\,\! , since the composition of measurable functions is also measurable. (However, this is not true if g is Lebesgue measurable.) The same procedure that allowed one to go from a probability space (\Omega, P)\,\!  to (\mathbb{R}, dF_{X}) can be used to obtain the distribution of Y\,\! .

[edit]Example 1

Let X be a real-valued, continuous random variable and let Y = X2.

F_Y(y) = \operatorname{P}(X^2 \le y).

If y < 0, then P(X2 ≤ y) = 0, so

F_Y(y) = 0\qquad\hbox{if}\quad y < 0.

If y ≥ 0, then

\operatorname{P}(X^2 \le y) = \operatorname{P}(|X| \le \sqrt{y})<br />
 = \operatorname{P}(-\sqrt{y} \le  X \le \sqrt{y}),

so

F_Y(y) = F_X(\sqrt{y}) - F_X(-\sqrt{y})\qquad\hbox{if}\quad y \ge 0.

[edit]Example 2

Suppose \scriptstyle X is a random variable with a cumulative distribution

 F_{X}(x) = P(X \leq x) = \frac{1}{(1 + e^{-x})^{\theta}}

where \scriptstyle \theta > 0 is a fixed parameter. Consider the random variable  \scriptstyle Y = \mathrm{log}(1 + e^{-X}). Then,

 F_{Y}(y) = P(Y \leq y) = P(\mathrm{log}(1 + e^{-X}) \leq y) = P(X > -\mathrm{log}(e^{y} - 1)).\,

The last expression can be calculated in terms of the cumulative distribution of X, so

 F_{Y}(y) = 1 - F_{X}(-\mathrm{log}(e^{y} - 1)) \,
 = 1 - \frac{1}{(1 + e^{\mathrm{log}(e^{y} - 1)})^{\theta}}
 = 1 - \frac{1}{(1 + e^{y} - 1)^{\theta}}
 = 1 - e^{-y \theta}.\,

[edit]Example 3

Suppose \scriptstyle X is a random variable with a standard normal distribution, whose density is

 f_X(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}

Consider the random variable  \scriptstyle Y = X^2. We can find the density using the above formula for a change of variables:

f_Y(y) = \sum_{i} f_X(g_{i}^{-1}(y)) \left| \frac{d g_{i}^{-1}(y)}{d y} \right|

In this case the change is not monotonic, because every value of \scriptstyle Y has two corresponding values of \scriptstyle X (one positive and negative). However, because of symmetry, both halves will transform identically, i.e.

f_Y(y) = 2f_X(g^{-1}(y)) \left| \frac{d g^{-1}(y)}{d y} \right|

The inverse transformation is

x = g^{-1}(y) = \sqrt{y}

and its derivative is

\frac{d g^{-1}(y)}{d y} = \frac{1}{2\sqrt{y}} .

Then:

<br />
\begin{align}<br />
f_Y(y) &= 2\frac{1}{\sqrt{2\pi}}e^{-y/2} \frac{1}{2\sqrt{y}} \\<br />
&= \frac{1}{\sqrt{2\pi y}}e^{-y/2}<br />
\end{align}<br />

This is a chi-squared distribution with one degree of freedom.

[edit]Equivalence of random variables

There are several different senses in which random variables can be considered to be equivalent. Two random variables can be equal, equal almost surely, or equal in distribution.

In increasing order of strength, the precise definition of these notions of equivalence is given below.

[edit]Equality in distribution

If the sample space is a subset of the real line a possible definition is that random variables X and Y are equal in distribution if they have the same distribution functions:

\operatorname{P}(X \le x) = \operatorname{P}(Y \le x)\quad\hbox{for all}\quad x.

Two random variables having equal moment generating functions have the same distribution. This provides, for example, a useful method of checking equality of certain functions of i.i.d. random variables. However, the moment generating function exists only for distributions that have a defined Laplace transform.

[edit]Almost sure equality

Two random variables X and Y are equal almost surely if, and only if, the probability that they are different is zero:

\operatorname{P}(X \neq Y) = 0.

For all practical purposes in probability theory, this notion of equivalence is as strong as actual equality. It is associated to the following distance:

d_\infty(X,Y)=\mathrm{ess } \sup_\omega|X(\omega)-Y(\omega)|,

where “ess sup” represents the essential supremum in the sense of measure theory.

[edit]Equality

Finally, the two random variables X and Y are equal if they are equal as functions on their measurable space:

X(\omega)=Y(\omega)\qquad\hbox{for all }\omega.

[edit]Convergence

A significant theme in mathematical statistics consists of obtaining convergence results for certain sequences of random variables; for instance the law of large numbers and the central limit theorem.

There are various senses in which a sequence (Xn) of random variables can converge to a random variable X. These are explained in the article on convergence of random variables.

[edit]See also

[edit]References

  1. ^ Rice, John (1999). Mathematical Statistics and Data Analysis. Duxbury Press. ISBN 0-534-20934-3.
  2. ^ Fristedt & Gray (1996, page 11)
This article includes a list of references, but its sources remain unclear because it has insufficient inline citations. Please help toimprove this article by introducing more precise citations. (February 2012)

[edit]Literature

  • Fristedt, Bert; Gray, Lawrence (1996). A modern approach to probability theory. Boston: Birkhäuser. ISBN 3-7643-3807-5.
  • Kallenberg, O. (1986) Random Measures, 4th edition. Academic Press, New York, London; Akademie-Verlag, Berlin. MR0854102 ISBN 0-12-394960-2
  • Kallenberg, O. (2001) Foundations of Modern Probability, 2nd edition. Springer-Verlag, New York, Berlin, Heidelberg). ISBN 0-387-95313-2
  • Papoulis, Athanasios (1965) Probability, Random Variables, and Stochastic Processes. McGraw–Hill Kogakusha, Tokyo, 9th edition, ISBN 0-07-119981-0.

[edit]External links

Posted in 维基转载 | Tagged | Leave a comment

Plasmid

Plasmid

From Wikipedia, the free encyclopedia

For the physics phenomena, see plasmoid. For the computer user interface, see KDE Plasma Workspaces.

Figure 1: Illustration of a bacterium with plasmid enclosed showing genomic DNA and plasmids.

In microbiology and genetics, a plasmid is a DNA molecule that is separate from, and can replicate independently of, the chromosomal DNA.[1] They are double-stranded and, in many cases, circular. Plasmids usually occur naturally in bacteria, but are sometimes found ineukaryotic organisms (e.g., the 2-micrometre ring in Saccharomyces cerevisiae).

Plasmid sizes vary from 1 to over 1,000 kbp. The number of identical plasmids in a single cell can range anywhere from one to thousands under some circumstances. Plasmids can be considered part of the mobilome because they are often associated with conjugation, a mechanism of horizontal gene transfer.

The term plasmid was first introduced by the American molecular biologist Joshua Lederberg in 1952.[2]

Plasmids are considered replicons, capable of replicating autonomously within a suitable host. Plasmids can be found in all three majordomains: Archaea, Bacteria, and Eukarya.[1] Similar to viruses, plasmids are not considered by some to be a form of life.[3] Unlike viruses, plasmids are naked DNA and do not encode genes necessary to encase the genetic material for transfer to a new host, though some classes of plasmids encode the sex pilus necessary for their own transfer. Plasmid host-to-host transfer requires direct, mechanical transfer by conjugation or changes in host gene expression allowing the intentional uptake of the genetic element by transformation.[1] Microbial transformation with plasmid DNA is neither parasitic nor symbiotic in nature, because each implies the presence of an independent species living in a commensal or detrimental state with the host organism. Rather, plasmids provide a mechanism for horizontal gene transfer within a population of microbes and typically provide a selective advantage under a given environmental state. Plasmids may carry genes that provide resistance to naturally occurring antibiotics in a competitive environmental niche, or the proteins produced may act as toxins under similar circumstances. Plasmids can also provide bacteria with the ability to fix elemental nitrogen or to degrade recalcitrant organic compounds that provide an advantage when nutrients are scarce.[1]

Contents

  [hide

[edit]Vectors

There are two types of plasmid integration into a host bacteria: Non-integrating plasmids replicate as with the top instance, whereas episomes, the lower example, integrate into the host chromosome.

Further information: Vector (molecular biology)

Plasmids used in genetic engineering are called vectors. Plasmids serve as important tools in genetics and biotechnology labs, where they are commonly used to multiply (make many copies of) or express particular genes.[4]Many plasmids are commercially available for such uses. The gene to be replicated is inserted into copies of a plasmid containing genes that make cells resistant to particular antibiotics and a multiple cloning site (MCS, or polylinker), which is a short region containing several commonly used restriction sites allowing the easy insertion of DNA fragments at this location. Next, the plasmids are inserted into bacteria by a process called transformation. Then, the bacteria are exposed to the particular antibiotics. Only bacteria that take up copies of the plasmid survive, since the plasmid makes them resistant. In particular, the protecting genes are expressed (used to make a protein) and the expressed protein breaks down the antibiotics. In this way, the antibiotics act as a filter to select only the modified bacteria. Now these bacteria can be grown in large amounts, harvested, and lysed (often using the alkaline lysis method) to isolate the plasmid of interest.

Another major use of plasmids is to make large amounts of proteins. In this case, researchers grow bacteria containing a plasmid harboring the gene of interest. Just as the bacterium produces proteins to confer its antibiotic resistance, it can also be induced to produce large amounts of proteins from the inserted gene. This is a cheap and easy way of mass-producing a gene or the protein it then codes for, for example, insulin or even antibiotics.

However, a plasmid can contain inserts of only about 1–10 kbp. To clone longer lengths of DNA, lambda phage with lysogeny genes deleted, cosmids, bacterial artificial chromosomes, or yeast artificial chromosomes are used.

[edit]Applications

[edit]Disease models

Plasmids were historically used to genetically engineer the embryonic stem cells of rats in order to create rat genetic disease models. The limited efficiency of plasmid-based techniques precluded their use in the creation of more accurate human cell models. However, developments in Adeno-associated virus recombination techniques, and Zinc finger nucleases, have enabled the creation of a new generation of isogenic human disease models.

[edit]Gene therapy

The success of some strategies of gene therapy depends on the efficient insertion of therapeutic genes at the appropriate chromosomal target sites within the human genome, without causing cell injury, oncogenic mutations (cancer) or an immune response. Plasmid vectors are one of many approaches that could be used for this purpose. Zinc finger nucleases (ZFNs) offer a way to cause a site-specific double-strand break to the DNA genome and cause homologous recombination. This makes targeted gene correction a possibility in human cells. Plasmids encoding ZFN could be used to deliver a therapeutic gene to a pre-selected chromosomal site with a frequency higher than that of random integration. Although the practicality of this approach to gene therapy has yet to be proven, some aspects of it could be less problematic than the alternative viral-based delivery of therapeutic genes.[5]

[edit]Episomes

Episomes are the eukaryotic equivalent of bacterial plasmids. In general, in eukaryotes, episomes are closed circular DNA molecules that are replicated in the nucleus. Viruses are the most common examples of this, such as herpesviruses, adenoviruses, and polyomaviruses. Other examples include aberrant chromosomal fragments, such as double minute chromosomes, that can arise during artificial gene amplifications or in pathologic processes (e.g., cancer cell transformation). Episomes in eukaryotes behave similarly to plasmids in prokaryotes in that the DNA is stably maintained and replicated with the host cell. Cytoplasmic viral episomes (as in poxvirus infections) can also occur. Some episomes, such as herpesviruses, replicate in a rolling circle mechanism, similar to bacterial phage viruses. Others replicate through a bidirectional replication mechanism (Theta type plasmids). In either case, episomes remain physically separate from host cell chromosomes. Several cancer viruses, including Epstein-Barr virus and Kaposi’s sarcoma-associated herpesvirus, are maintained as latent, chromosomally-distinct episomes in cancer cells, where the viruses express oncogenes that promote cancer cell proliferation. In cancers, these episomes passively replicate together with host chromosomes when the cell divides. When these viral episomes initiate lytic replication to generate multiple virus particles, they in general activate cellular innate immunity defense mechanisms that kill the host cell.

[edit]Types

Overview of bacterial conjugation

Electron micrograph of a DNA fiber bundle, presumably of a single bacterial chromosome loop.

Electron micrograph of a bacterial DNA plasmid (chromosome fragment).

One way of grouping plasmids is by their ability to transfer to other bacteria. Conjugative plasmids contain tra genes, which perform the complex process of conjugation, the transfer of plasmids to another bacterium (Fig. 4). Non-conjugative plasmids are incapable of initiating conjugation, hence they can be transferred only with the assistance of conjugative plasmids. An intermediate class of plasmids are mobilizable, and carry only a subset of the genes required for transfer. They can parasitize a conjugative plasmid, transferring at high frequency only in its presence. Plasmids are now being used to manipulate DNA and may possibly be a tool for curing many diseases.

It is possible for plasmids of different types to coexist in a single cell. Several different plasmids have been found in E. coli. However, related plasmids are often incompatible, in the sense that only one of them survives in the cell line, due to the regulation of vital plasmid functions. Thus, plasmids can be assigned into incompatibility groups.

Another way to classify plasmids is by function. There are five main classes:

  • Fertility F-plasmids, which contain tra genes. They are capable of conjugation and result in the expression of sex pilli.
  • Resistance (R)plasmids, which contain genes that can build a resistance against antibiotics or poisons and help bacteria produce to overcome the deleterious effects of antibiotics. Historically known as R-factors, before the nature of plasmids was understood.
  • Col plasmids, which contain genes that code for bacteriocins, proteins that can kill other bacteria.
  • Degradative plasmids, which enable the digestion of unusual substances, e.g. toluene and salicylic acid.
  • Virulence plasmids, which turn the bacterium into a pathogen.

Plasmids can belong to more than one of these functional groups.

Plasmids that exist only as one or a few copies in each bacterium are, upon cell division, in danger of being lost in one of the segregating bacteria. Such single-copy plasmids have systems that attempt to actively distribute a copy to both daughter cells. These systems are often referred to as thepartition system or partition function of a plasmid.

Some plasmids or microbial hosts include an addiction system or postsegregational killing system (PSK), such as the hok/sok (host killing/suppressor of killing) system of plasmid R1 in Escherichia coli.[6] This variant produces both a long-lived poison and a short-lived antidote. Several types of plasmid addiction systems (toxin/ antitoxin, metabolism-based, ORT systems) were described in the literature[7] and used in biotechnical (fermentation) or biomedical (vaccine therapy) applications. Daughter cells that retain a copy of the plasmid survive, while a daughter cell that fails to inherit the plasmid dies or suffers a reduced growth-rate because of the lingering poison from the parent cell. Finally, the overall productivity could be enhanced.

[edit]Yeast plasmids

Other types of plasmids are often related to yeast cloning vectors that include:

  • Yeast integrative plasmid (YIp), yeast vectors that rely on integration into the host chromosome for survival and replication, and are usually used when studying the functionality of a solo gene or when the gene is toxic. Also connected with the gene URA3, that codes an enzyme related to the biosynthesis of pyrimidine nucleotides (T, C);
  • Yeast Replicative Plasmid (YRp), which transport a sequence of chromosomal DNA that includes an origin of replication. These plasmids are less stable, as they can get lost during the budding.

[edit]Plasmid DNA extraction

As alluded to above, plasmids are often used to purify a specific sequence, since they can easily be purified away from the rest of the genome. For their use as vectors, and for molecular cloning, plasmids often need to be isolated.

There are several methods to isolate plasmid DNA from bacteria, the archetypes of which are the miniprep and the maxiprep/bulkprep.[4] The former can be used to quickly find out whether the plasmid is correct in any of several bacterial clones. The yield is a small amount of impure plasmid DNA, which is sufficient for analysis by restriction digest and for some cloning techniques.

In the latter, much larger volumes of bacterial suspension are grown from which a maxi-prep can be performed. In essence, this is a scaled-up miniprep followed by additional purification. This results in relatively large amounts (several micrograms) of very pure plasmid DNA.

In recent times, many commercial kits have been created to perform plasmid extraction at various scales, purity, and levels of automation. Commercial services can prepare plasmid DNA at quoted prices below $300/mg in milligram quantities and $15/mg in gram quantities (early 2007).

[edit]Conformations

Plasmid DNA may appear in one of five conformations, which (for a given size) run at different speeds in a gel during electrophoresis. The conformations are listed below in order of electrophoretic mobility (speed for a given applied voltage) from slowest to fastest:

  • Nicked open-circular DNA has one strand cut.
  • Relaxed circular DNA is fully intact with both strands uncut, but has been enzymatically relaxed (supercoils removed). This can be modeled by letting a twisted extension cord unwind and relax and then plugging it into itself.
  • Linear DNA has free ends, either because both strands have been cut or because the DNA was linear in vivo. This can be modeled with an electrical extension cord that is not plugged into itself.
  • Supercoiled (or covalently closed-circular) DNA is fully intact with both strands uncut, and with an integral twist, resulting in a compact form. This can be modeled by twisting an extension cordand then plugging it into itself.
  • Supercoiled denatured DNA is like supercoiled DNA, but has unpaired regions that make it slightly less compact; this can result from excessive alkalinity during plasmid preparation.

The rate of migration for small linear fragments is directly proportional to the voltage applied at low voltages. At higher voltages, larger fragments migrate at continuously increasing yet different rates. Thus, the resolution of a gel decreases with increased voltage.

At a specified, low voltage, the migration rate of small linear DNA fragments is a function of their length. Large linear fragments (over 20 kb or so) migrate at a certain fixed rate regardless of length. This is because the molecules ‘resperate’, with the bulk of the molecule following the leading end through the gel matrix. Restriction digests are frequently used to analyse purified plasmids. These enzymes specifically break the DNA at certain short sequences. The resulting linear fragments form ‘bands’ after gel electrophoresis. It is possible to purify certain fragments by cutting the bands out of the gel and dissolving the gel to release the DNA fragments.

Because of its tight conformation, supercoiled DNA migrates faster through a gel than linear or open-circular DNA.

[edit]Simulation of plasmids

The use of plasmids as a technique in molecular biology is supported by bioinformatics software. These programmes record the DNA sequence of plasmid vectors, help to predict cut sites ofrestriction enzymes, and to plan manipulations. Examples of software packages that handle plasmid maps are pDraw32, Geneious, Lasergene, GeneConstructionKit, ApE, Clone Manager, andVector NTI.

[edit]See also

[edit]References

  1. ^ a b c d Lipps G (editor). (2008). Plasmids: Current Research and Future Trends. Caister Academic Press. ISBN 978-1-904455-35-6.
  2. ^ Lederberg J (1952). "Cell genetics and hereditary symbiosis". Physiol. Rev. 32 (4): 403–430.PMID 13003535.
  3. ^ Sinkovics, J; Harvath J, Horak A. (1998). "The Origin and evolution of viruses (a review)". Acta microbiologica et immunologica Hungarica (St. Joseph’s Hospital, Department of Medicine, University of South Florida College of Medicine, Tampa, FL, USA.: Akademiai Kiado) 45 (3–4): 349–90. ISSN 1217-8950. PMID 9873943.
  4. ^ a b Russell, David W.; Sambrook, Joseph (2001). Molecular cloning: a laboratory manual. Cold Spring Harbor, N.Y: Cold Spring Harbor Laboratory.
  5. ^ Kandavelou K; Chandrasegaran S (2008). "Plasmids for Gene Therapy". Plasmids: Current Research and Future Trends. Caister Academic Press. ISBN 978-1-904455-35-6.
  6. ^ Gerdes K, Rasmussen PB, Molin S (1986). "Unique type of plasmid maintenance function: postsegregational killing of plasmid-free cells". Proc. Natl. Acad. Sci. U.S.A. 83 (10): 3116.doi:10.1073/pnas.83.10.3116. PMC 323463. PMID 3517851.
  7. ^ Kroll J, Klinter S, Schneider C, Voß I, Steinbüchel A (2010). "Plasmid addiction systems: perspectives and applications in biotechnology". Microb Biotechnol. 3 (6): 634–657.doi:10.1111/j.1751-7915.2010.00170.x. PMID 21255361.
Posted in 维基转载 | Tagged , , , , , | Leave a comment

What is a contig?


http://staden.sourceforge.net/contig.html

What is a contig?

The need for the original use of the word contig has not diminished, but the current diversity of uses is causing confusion. Here we explain the origins of the term. The Sanger shotgun sequencing method relies on using computer programs to find matches to order sets of overlapping clones (Staden,R. "A strategy of DNA sequencing employing computer programs", Nucleic Acids Res. 7, 2601-2610, 1979). To aid discussion of this new class of object the word contig was invented (Staden,R. "A new computer method for the storage and manipulation of DNA gel reading data", Nucleic Acids Res. 8, 3673-3694 (1980)). This paper contained the following:

"Definition of a contig

In order to make it easier to talk about our data gained by the shotgun method of sequencing we have invented the word "contig". A contig is a set of gel readings that are related to one another by overlap of their sequences. All gel readings belong to one and only one contig, and each contig contains at least one gel reading. The gel readings in a contig can be summed to form a contiguous consensus sequence and the length of this sequence is the length of the contig."

This defines a contig to be a set of overlapping segments of DNA. It naturally allows a set to contain a single element (ready for further comparison). It also defines the length of a contig to be the length of the consensus sequence derived from it. In the light of the current confusion, note that consensus sequences and contigs are entirely different classes of object.

Its meaning was broadened when the same term was used by Coulson, A.R., Sulston, J.E., Brenner, S. and Karn, J., Proc. Natl. Acad. Sc. 83, 7821-7825, (1986) to define sets of overlapping fingerprint clones used in physical mapping of the nematode genome. Unfortunately they did not refer to the published definition, but in a footnote defined contig to be "groups of clones with contiguous nucleotide sequences". In the fingerprint comparison software they required their "groups of clones" to contain single clones otherwise they would miss overlaps. The length of the contigs could be estimated, but the Sanger Centre had to be created before it was possible to calculate their consensus sequences.

This was a natural and consistent extension to overlapping clones, but since then the value of the term has slowly been eroded, not helped by various dictionaries and encyclopedias. Here are two examples reported by confused correspondents.

Contig: A continuous sequence of DNA that has been assembled from overlapping cloned DNA fragments. The Encyclopedia of Molecular Biology (1994,Blackwell). No longer a set – merely a string of A,C,G,T, formerly known as "sequence".

Contig: One of a set of overlapping clones that represent a continuous region of DNA. Each contig is a genomic clone, usually in a cosmid or a yac. They are used in contig mapping. Contig mapping is a technique which relies on the use of overlapping clones, referred to as contigs. Oxford Dictionary of Biochemistry and Molecular Biology. i.e. a contig is a single clone that is part of a set of overlapping clones which is a contig.

Currently there are several contradictory defintions, often reducing the meaning to that contained in the Blackwell Encyclopedia of Molecular Biology, ie a replacement for "sequence". A web search will reveal many more. For examplehttp://www.ncbi.nlm.nih.gov/genome/guide/build.html which defines a contig as a "Contiguous sequence constructed from many clone sequences. It may include draft and finished sequence. It may contain sequence gaps (within a clone), but it does not include gaps between clones". i.e. another type of sequence.

Posted in 网络转载 | Tagged , | Leave a comment

Linear differential equation

Linear differential equation

From Wikipedia, the free encyclopedia

Differential equations

De template.svg

Scope[show]

Classification

Operations[show]

Attributes of variables[show]

Relation to processes[show]

Solutions

Solution topics[show]

Solution methods[show]

Mathematicians[show]

In mathematics, linear differential equations are differential equations that have solutions which can be added together to form other solutions. They can be ordinary or partial. The solutions to linear equations form a vector space (not the case with non-linear differential equations).

Contents

  [hide

[edit]Introduction

Linear differential equations are of the form

 Ly = f \,

where the differential operator L is a linear operator, y is the unknown function (such as a function of time y(t)), and the right hand side ƒ is a given function of the same nature as y (called the source term). For a function dependent on time we may write the equation more expressively as

 L y(t) = f(t) \,

and, even more precisely by bracketing

 L [y(t)] = f(t) \,

The linear operator L may be considered to be of the form[1]

L_n(y) \equiv \frac{d^n y}{dt^n} + A_1(t)\frac{d^{n-1}y}{dt^{n-1}} + \cdots + 
A_{n-1}(t)\frac{dy}{dt} + A_n(t)y \,

The linearity condition on L rules out operations such as taking the square of the derivative of y; but permits, for example, taking the second derivative of y. It is convenient to rewrite this equation in an operator form

 L_n(y) \equiv \left[\,D^n  + A_{1}(t)D^{n-1} + \cdots + A_{n-1}(t) D  + A_n(t)\right] y

where D is the differential operator d/dt (i.e. Dy = y’ , D2y = y",… ), and the An are given functions.

Such an equation is said to have order n, the index of the highest derivative of y that is involved.

A typical simple example is the linear differential equation used to model radioactive decay.[2] Let N(t) denote the number of radioactive atoms in some sample of material [3] at time t. Then for some constant k > 0, the number of radioactive atoms which decay can be modelled by

 \frac{dN}{dt} = -k N\,

If y is assumed to be a function of only one variable, one speaks about an ordinary differential equation, else the derivatives and their coefficients must be understood as (contracted) vectors, matrices or tensors of higher rank, and we have a (linear) partial differential equation.

The case where ƒ = 0 is called a homogeneous equation and its solutions are called complementary functions. It is particularly important to the solution of the general case, since any complementary function can be added to a solution of the inhomogeneous equation to give another solution (by a method traditionally called particular integral and complementary function). When the Ai are numbers, the equation is said to have constant coefficients.

[edit]Homogeneous equations with constant coefficients

Main article: Characteristic equation (calculus)

The first method of solving linear ordinary differential equations with constant coefficients is due to Euler, who realized that solutions have the form  e^{z x} , for possibly-complex values of z. The exponential function is one of the few functions that keep its shape after differentiation. In order for the sum of multiple derivatives of a function to sum up to zero, the derivatives must cancel each other out and the only way for them to do so is for the derivatives to have the same form as the initial function. Thus, to solve

y^{(n)} + A_{1}y^{(n-1)} + \cdots + A_{n}y = 0

we set y=e^{z x}, leading to

z^n e^{zx} + A_1 z^{n-1} e^{zx} + \cdots + A_n e^{zx} = 0.

Division by e zx gives the nth-order polynomial

F(z) = z^{n} + A_{1}z^{n-1} + \cdots + A_n = 0.\,

This algebraic equation F(z) = 0, is the characteristic equation considered later by Gaspard Monge and Augustin-Louis Cauchy.

Formally, the terms

y^{(k)}\quad\quad(k = 1, 2, \dots, n).

of the original differential equation are replaced by zk. Solving the polynomial gives n values of z, z1, …, zn. Substitution of any of those values for z into e zx gives a solution e zix. Since homogeneous linear differential equations obey thesuperposition principle, any linear combination of these functions also satisfies the differential equation.

When these roots are all distinct, we have n distinct solutions to the differential equation. It can be shown that these are linearly independent, by applying the Vandermonde determinant, and together they form a basis of the space of all solutions of the differential equation.

Examples

y''''-2y'''+2y''-2y'+y=0 \,

has the characteristic equation

z^4-2z^3+2z^2-2z+1=0. \,

This has zeroes, i, −i, and 1 (multiplicity 2). The solution basis is then

e^{ix} ,\, e^{-ix} ,\, e^x ,\, xe^x \,.

This corresponds to the real-valued solution basis

\cos x ,\, \sin x ,\, e^x ,\, xe^x \,.

The preceding gave a solution for the case when all zeros are distinct, that is, each has multiplicity 1. For the general case, if z is a (possibly complex)zero (or root) of F(z) having multiplicity m, then, for k\in\{0,1,\dots,m-1\} \,, y=x^ke^{zx} \, is a solution of the ODE. Applying this to all roots gives a collection of n distinct and linearly independent functions, where n is the degree of F(z). As before, these functions make up a basis of the solution space.

If the coefficients Ai of the differential equation are real, then real-valued solutions are generally preferable. Since non-real roots z then come inconjugate pairs, so do their corresponding basis functions xkezx, and the desired result is obtained by replacing each pair with their real-valued linear combinations Re(y) and Im(y), where y is one of the pair.

A case that involves complex roots can be solved with the aid of Euler’s formula.

[edit]Examples

Given y''-4y'+5y=0 \,. The characteristic equation is z^2-4z+5=0 \, which has roots 2+i and 2−i. Thus the solution basis \{y_1,y_2\} is \{e^{(2+i)x},e^{(2-i)x}\} \,. Now y is a solution if and only if y=c_1y_1+c_2y_2 \, for c_1,c_2\in\mathbb C.

Because the coefficients are real,

  • we are likely not interested in the complex solutions
  • our basis elements are mutual conjugates

The linear combinations

u_1=\mbox{Re}(y_1)=\frac{y_1+y_2}{2}=e^{2x}\cos(x) \, and
u_2=\mbox{Im}(y_1)=\frac{y_1-y_2}{2i}=e^{2x}\sin(x) \,

will give us a real basis in \{u_1,u_2\}.

[edit]Simple harmonic oscillator

The second order differential equation

 D^2 y = -k^2 y,

which represents a simple harmonic oscillator, can be restated as

 (D^2 + k^2) y = 0.

The expression in parenthesis can be factored out, yielding

 (D + i k) (D - i k) y = 0,

which has a pair of linearly independent solutions, one for

 (D - i k) y = 0

and another for

 (D + i k) y = 0.

The solutions are, respectively,

 y_0 = A_0 e^{i k x}

and

 y_1 = A_1 e^{-i k x}.

These solutions provide a basis for the two-dimensional solution space of the second order differential equation: meaning that linear combinations of these solutions will also be solutions. In particular, the following solutions can be constructed

 y_{0'} = {A_0 e^{i k x} + A_1 e^{-i k x} \over 2} = C_0 \cos (k x)

and

 y_{1'} = {A_0 e^{i k x} - A_1 e^{-i k x} \over 2 i} = C_1 \sin (k x).

These last two trigonometric solutions are linearly independent, so they can serve as another basis for the solution space, yielding the following general solution:

 y_H = C_0 \cos (k x) + C_1 \sin (k x).
[edit]Damped harmonic oscillator

Given the equation for the damped harmonic oscillator:

 \left(D^2 + {b \over m} D + \omega_0^2\right)  y =  0,

the expression in parentheses can be factored out: first obtain the characteristic equation by replacing D with λ. This equation must be satisfied for all y, thus:

 \lambda^2 + {b \over m} \lambda + \omega_0^2 = 0.

Solve using the quadratic formula:

 \lambda = {-b/m \pm \sqrt{b^2 / m^2 - 4 \omega_0^2} \over 2}.

Use these data to factor out the original differential equation:

 \left(D + {b \over 2 m} - \sqrt{{b^2 \over 4 m^2} - \omega_0^2} \right) \left(D + {b \over 2m} + \sqrt{{b^2 \over 4 m^2} - \omega_0^2}\right) y = 0.

This implies a pair of solutions, one corresponding to

 \left(D + {b \over 2 m} - \sqrt{{b^2 \over 4 m^2} - \omega_0^2} \right) y = 0

and another to

 \left(D + {b \over 2m} + \sqrt{{b^2 \over 4 m^2} - \omega_0^2}\right) y = 0

The solutions are, respectively,

 y_0 = A_0 e^{-\omega x + \sqrt{\omega^2 - \omega_0^2} x} = A_0 e^{-\omega x} e^{\sqrt{\omega^2 - \omega_0^2} x}

and

 y_1 = A_1 e^{-\omega x - \sqrt{\omega^2 - \omega_0^2} x} = A_1 e^{-\omega x} e^{-\sqrt{\omega^2 - \omega_0^2} x}

where ω = b / 2m. From this linearly independent pair of solutions can be constructed another linearly independent pair which thus serve as a basis for the two-dimensional solution space:

 y_H (A_0, A_1) (x) = \left(A_0 \sinh \sqrt{\omega^2 - \omega_0^2} x + A_1 \cosh \sqrt{\omega^2 - \omega_0^2} x\right) e^{-\omega x}.

However, if |ω| < |ω0| then it is preferable to get rid of the consequential imaginaries, expressing the general solution as

 y_H (A_0, A_1) (x) = \left(A_0 \sin \sqrt{\omega_0^2 - \omega^2} x + A_1 \cos \sqrt{\omega_0^2 - \omega^2} x\right) e^{-\omega x}.

This latter solution corresponds to the underdamped case, whereas the former one corresponds to the overdamped case: the solutions for the underdamped case oscillate whereas the solutions for the overdamped case do not.

[edit]Nonhomogeneous equation with constant coefficients

To obtain the solution to the nonhomogeneous equation (sometimes called inhomogeneous equation), find a particular integral yP(x) by either the method of undetermined coefficients or the method of variation of parameters; the general solution to the linear differential equation is the sum of the general solution of the related homogeneous equation and the particular integral. Or, when the initial conditions are set, use Laplace transform to obtain the particular solution directly.

Suppose we face

\frac {d^{n}y(x)} {dx^{n}} + A_{1}\frac {d^{n-1}y(x)} {dx^{n-1}} + \cdots + A_{n}y(x) = f(x).

For later convenience, define the characteristic polynomial

P(v)=v^n+A_1v^{n-1}+\cdots+A_n.

We find the solution basis \{y_1(x),y_2(x),\ldots,y_n(x)\} as in the homogeneous (f(x)=0) case. We now seek a particular integral yp(x) by the variation of parameters method. Let the coefficients of the linear combination be functions of x:

y_p(x) = u_1(x) y_1(x) + u_2(x) y_2(x) + \cdots + u_n(x) y_n(x).

For ease of notation we will drop the dependency on x (i.e. the various (x)). Using the operator notation D = d/dx, the ODE in question is P(D)y = f; so

f=P(D)y_p=P(D)(u_1y_1)+P(D)(u_2y_2)+\cdots+P(D)(u_ny_n).

With the constraints

0=u'_1y_1+u'_2y_2+\cdots+u'_ny_n
0=u'_1y'_1+u'_2y'_2+\cdots+u'_ny'_n
 \cdots
0=u'_1y^{(n-2)}_1+u'_2y^{(n-2)}_2+\cdots+u'_ny^{(n-2)}_n

the parameters commute out,

f=u_1P(D)y_1+u_2P(D)y_2+\cdots+u_nP(D)y_n+u'_1y^{(n-1)}_1+u'_2y^{(n-1)}_2+\cdots+u'_ny^{(n-1)}_n.

But P(D)yj = 0, therefore

f=u'_1y^{(n-1)}_1+u'_2y^{(n-1)}_2+\cdots+u'_ny^{(n-1)}_n.

This, with the constraints, gives a linear system in the u′j. This much can always be solved; in fact, combining Cramer’s rule with the Wronskian,

u'_j=(-1)^{n+j}\frac{W(y_1,\ldots,y_{j-1},y_{j+1}\ldots,y_n)_{0 \choose f}}{W(y_1,y_2,\ldots,y_n)}.

The rest is a matter of integrating u′j.

The particular integral is not unique; y_p+c_1y_1+\cdots+c_ny_n also satisfies the ODE for any set of constants cj.

[edit]Example

Suppose y''-4y'+5y=sin(kx). We take the solution basis found above \{e^{(2+i)x},e^{(2-i)x}\}.

W\,
= \begin{vmatrix}e^{(2+i)x}&e^{(2-i)x} \\ (2+i)e^{(2+i)x}&(2-i)e^{(2-i)x} \end{vmatrix}

=e^{4x}\begin{vmatrix}1&1\\ 2+i&2-i\end{vmatrix}

=-2ie^{4x}\,

u'_1\,
=\frac{1}{W}\begin{vmatrix}0&e^{(2-i)x}\\ \sin(kx)&(2-i)e^{(2-i)x}\end{vmatrix}

=-\frac{i}2\sin(kx)e^{(-2-i)x}

u'_2\,
=\frac{1}{W}\begin{vmatrix}e^{(2+i)x}&0\\ (2+i)e^{(2+i)x}&\sin(kx)\end{vmatrix}

 =\frac{i}{2}\sin(kx)e^{(-2+i)x}.

Using the list of integrals of exponential functions

u_1\,
=-\frac{i}{2}\int\sin(kx)e^{(-2-i)x}\,dx

=\frac{ie^{(-2-i)x}}{2(3+4i+k^2)}\left((2+i)\sin(kx)+k\cos(kx)\right)

u_2\,
=\frac i2\int\sin(kx)e^{(-2+i)x}\,dx

=\frac{ie^{(i-2)x}}{2(3-4i+k^2)}\left((i-2)\sin(kx)-k\cos(kx)\right).

And so

y_p\,
=\frac{i}{2(3+4i+k^2)}\left((2+i)\sin(kx)+k\cos(kx)\right)
+\frac{i}{2(3-4i+k^2)}\left((i-2)\sin(kx)-k\cos(kx)\right)

=\frac{(5-k^2)\sin(kx)+4k\cos(kx)}{(3+k^2)^2+16}.

(Notice that u1 and u2 had factors that canceled y1 and y2; that is typical.)

For interest’s sake, this ODE has a physical interpretation as a driven damped harmonic oscillator; yp represents the steady state, and c_1y_1+c_2y_2 is the transient.

[edit]Equation with variable coefficients

A linear ODE of order n with variable coefficients has the general form

p_{n}(x)y^{(n)}(x) + p_{n-1}(x) y^{(n-1)}(x) + \cdots + p_0(x) y(x) = r(x).
[edit]Examples

A simple example is the Cauchy–Euler equation often used in engineering

x^n y^{(n)}(x) + a_{n-1} x^{n-1} y^{(n-1)}(x) + \cdots + a_0 y(x) = 0.

[edit]First order equation

Examples

Solve the equation

y'(x)+3y(x)=2 \,

with the initial condition

y\left(0\right)=2. \,

Using the general solution method:

y=e^{-3x}\left(\int 2 e^{3x}\, dx + \kappa\right). \,

The indefinite integral is solved to give:

y=e^{-3x}\left(2/3 e^{3x} + \kappa\right). \,

Then we can reduce to:

y=2/3 + \kappa e^{-3x}. \,

where κ is 4/3 from the initial condition.

A linear ODE of order 1 with variable coefficients has the general form

Dy(x) + f(x) y(x) = g(x).

Where D is the differential operator. Equations of this form can be solved by multiplying the integrating factor

e^{\int f(x)\,dx}

throughout to obtain

 Dy(x)e^{\int f(x)\,dx}+f(x)y(x)e^{\int f(x)\,dx}=g(x)e^{\int f(x) \, dx},

which simplifies due to the product rule to

 D (y(x)e^{\int f(x)\,dx})=g(x)e^{\int f(x)\,dx}

which, on integrating both sides, yields

 y(x)e^{\int f(x)\,dx}=\int g(x)e^{\int f(x)\,dx} \,dx+c ~,
 y(x) = {\int g(x)e^{\int f(x)\,dx} \,dx+c \over e^{\int f(x)\,dx}} ~.

In other words: The solution of a first-order linear ODE

y'(x) + f(x) y(x) = g(x),

with coefficients that may or may not vary with x, is:

y=e^{-a(x)}\left(\int g(x) e^{a(x)}\, dx + \kappa\right)

where \kappa is the constant of integration, and

a(x)=\int{f(x)\,dx}.

A compact form of the general solution is (see J. Math. Chem. 48 (2010) 175):

 y(x) = \int_a^x \! {[y(a) \delta(t-a)+g(t)] e^{-\int_t^x \!f(u)du}\, dt}\,.

where \delta(x) is the generalized Dirac delta function.

[edit]Examples

Consider a first order differential equation with constant coefficients:

\frac{dy}{dx} + b y = 1.

This equation is particularly relevant to first order systems such as RC circuits and mass-damper systems.

In this case, p(x) = b, r(x) = 1.

Hence its solution is

y(x) = e^{-bx} \left( e^{bx}/b+ C \right) = 1/b + C e^{-bx} .

[edit]See also

[edit]External links

[edit]Notes

  1. ^ Gershenfeld 1999, p.9
  2. ^ Robinson 2004, p.5
  3. ^ Robinson 2004, p.7

[edit]References

  • Birkhoff, Garret and Rota, Gian-Carlo (1978), Ordinary Differential Equations, New York: John Wiley and Sons, Inc., ISBN 0-471-07411-X
  • Gershenfeld, Neil (1999), The Nature of Mathematical Modeling, Cambridge, UK.: Cambridge University Press, ISBN 978-0-521-57095-4
  • Robinson, James C. (2004), An Introduction to Ordinary Differential Equations, Cambridge, UK.: Cambridge University Press, ISBN 0-521-82650-0
Posted in 维基转载 | Tagged , , | Leave a comment

Master theorem

http://en.wikipedia.org/wiki/Master_theorem

Master theorem

From Wikipedia, the free encyclopedia

For a result in enumerative combinatorics, see MacMahon Master theorem.

In the analysis of algorithms, the master theorem provides a cookbook solution in asymptotic terms (using Big O notation) for recurrence relations of types that occur in the analysis of many divide and conquer algorithms. It was popularized by the canonical algorithms textbook Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein, in which it is both introduced and proved. Nevertheless, not all recurrence relations can be solved with the use of the master theorem; its generalizations include the Akra–Bazzi method.

Contents

  [hide

[edit]Introduction

Consider a problem that can be solved using a recursive algorithm such as the following:

 procedure T( n : size of problem ) defined as:
   if n < 1 then exit

Do work of amount f(n)

T(n/b) T(n/b) ...repeat for a total of a times... T(n/b) end procedure

In the above algorithm we are dividing the problem into a number of subproblems recursively, each subproblem being of size n/b. This can be visualized as building a call tree with each node of the tree as an instance of one recursive call and its child nodes being instances of subsequent calls. In the above example, each node would have a number of child nodes. Each node does an amount of work that corresponds to the size of the sub problem n passed to that instance of the recursive call and given by f(n). The total amount of work done by the entire tree is the sum of the work performed by all the nodes in the tree.

Algorithms such as above can be represented as a recurrence relation T(n) = a \; T\left(\frac{n}{b}\right) + f(n). This recursive relation can be successively substituted into itself and expanded to obtain expression for total amount of work done.[1]

The Master theorem allows us to easily calculate the running time of such a recursive algorithm in Θ-notation without doing an expansion of the recursive relation above.

[edit]Generic form

The master theorem concerns recurrence relations of the form:

T(n) = a \; T\!\left(\frac{n}{b}\right) + f(n)  \;\;\;\; \mbox{where} \;\; a \geq 1 \mbox{, } b > 1

In the application to the analysis of a recursive algorithm, the constants and function take on the following significance:

  • n is the size of the problem.
  • a is the number of subproblems in the recursion.
  • n/b is the size of each subproblem. (Here it is assumed that all subproblems are essentially the same size.)
  • f (n) is the cost of the work done outside the recursive calls, which includes the cost of dividing the problem and the cost of merging the solutions to the subproblems.

It is possible to determine an asymptotic tight bound in these three cases:

[edit]Case 1

[edit]Generic form

If f(n) = O\left( n^{\log_b (a) - \epsilon} \right) for some constant \epsilon > 0 (using Big O notation)

it follows that:

T(n) = \Theta\left( n^{\log_b a} \right)
[edit]Example
T(n) = 8 T\left(\frac{n}{2}\right) + 1000n^2

As one can see in the formula above, the variables get the following values:

a = 8, \, b = 2, \, f(n) = 1000n^2, \, \log_b a = \log_2 8 = 3

Now we have to check that the following equation holds:

f(n) = O\left( n^{\log_b a - \epsilon} \right)
1000n^2 = O\left( n^{3 - \epsilon} \right)

If we choose \epsilon = 1, we get:

1000n^2 = O\left( n^{3 - 1} \right) = O\left( n^{2} \right)

Since this equation holds, the first case of the master theorem applies to the given recurrence relation, thus resulting in the conclusion:

T(n) = \Theta\left( n^{\log_b a} \right)

If we insert the values from above, we finally get:

T(n) = \Theta\left( n^{3} \right)

Thus the given recurrence relation T(n) was in Θ(n3).

(This result is confirmed by the exact solution of the recurrence relation, which is T(n) = 1001 n^3 - 1000 n^2, assuming T(1) = 1).

[edit]Case 2

[edit]Generic form

If it is true, for some constant k ≥ 0, that:

f(n) = \Theta\left( n^{\log_b a} \log^{k} n \right)

it follows that:

T(n) = \Theta\left( n^{\log_b a} \log^{k+1} n \right)
[edit]Example

T(n) = 2 T\left(\frac{n}{2}\right) + 10n

As we can see in the formula above the variables get the following values:

a = 2, \, b = 2, \, k = 0, \, f(n) = 10n, \, \log_b a = \log_2 2 = 1

Now we have to check that the following equation holds (in this case k=0):

f(n) = \Theta\left( n^{\log_b a} \right)

If we insert the values from above, we get:

10n = \Theta\left( n^{1} \right) = \Theta\left( n \right)

Since this equation holds, the second case of the master theorem applies to the given recurrence relation, thus resulting in the conclusion:

T(n) = \Theta\left( n^{\log_b a} \log^{k+1} n\right)

If we insert the values from above, we finally get:

T(n) = \Theta\left( n \log n\right)

Thus the given recurrence relation T(n) was in Θ(n log n).

(This result is confirmed by the exact solution of the recurrence relation, which is T(n) = n + 10 n\log_2 n, assuming T(1) = 1.)

[edit]Case 3

[edit]Generic form

If it is true that:

f(n) = \Omega\left( n^{\log_b (a) + \epsilon} \right) for some constant \epsilon > 0

and if it is also true that:

a f\left( \frac{n}{b} \right) \le c f(n) for some constant c < 1 and sufficiently large n

it follows that:

T\left(n \right) = \Theta \left(f \left(n \right) \right)
[edit]Example
T(n) = 2 T\left(\frac{n}{2}\right) + n^2

As we can see in the formula above the variables get the following values:

a = 2, \, b = 2, \, f(n) = n^2, \, \log_b a = \log_2 2 = 1

Now we have to check that the following equation holds:

f(n) = \Omega\left( n^{\log_b a + \epsilon} \right)

If we insert the values from above, and choose \epsilon = 1, we get:

n^2 = \Omega\left( n^{1 + 1} \right) = \Omega\left( n^2 \right)

Since this equation holds, we have to check the second condition, namely if it is true that:

a f\left( \frac{n}{b} \right) \le c f(n)

If we insert once more the values from above, we get the number :

2 \left( \frac{n}{2} \right)^2 \le c n^2 \Leftrightarrow \frac{1}{2} n^2 \le cn^2

If we choose  c = \frac{1}{2}, it is true that:

 \frac{1}{2} n^2 \le \frac{1}{2} n^2  \forall n \ge 1

So it follows:

T \left(n \right) = \Theta \left(f \left(n \right) \right).

If we insert once more the necessary values, we get:

T \left(n \right) = \Theta \left(n^2 \right).

Thus the given recurrence relation T(n) was in Θ(n2), that complies with the f (n) of the original formula.

(This result is confirmed by the exact solution of the recurrence relation, which is T(n) = 2 n^2 - n, assuming T(1) = 1.)

[edit]Inadmissible equations

The following equations cannot be solved using the master theorem:[2]

  • T(n) = 2^nT\left (\frac{n}{2}\right )+n^n
    a is not a constant
  • T(n) = 2T\left (\frac{n}{2}\right )+\frac{n}{\log n}
    non-polynomial difference between f(n) and n^{\log_b a} (see below)
  • T(n) = 0.5T\left (\frac{n}{2}\right )+n
    a<1 cannot have less than one sub problem
  • T(n) = 64T\left (\frac{n}{8}\right )-n^2\log n
    f(n) is not positive
  • T(n) = T\left (\frac{n}{2}\right )+n(2-\cos n)
    case 3 but regularity violation.

In the second inadmissible example above, the difference between f(n) and n^{\log_b a} can be expressed with the ratio \frac{f(n)}{n^{\log_b a}} = \frac{\frac{n}{\log n}}{n^{log_2 2}} = \frac{n}{n \log n} = \frac{1}{\log n}. It is clear that \frac{1}{\log n} < n^\epsilon for any constant \epsilon > 0. Therefore, the difference is not polynomial and the Master Theorem does not apply.

[edit]See also

[edit]Application to common algorithms

Algorithm

Recurrence Relationship

Run time

Comment

Binary search

T(n) = T\left(\frac{n}{2}\right) + O(1)

O(\log(n))

Apply Master theorem where f(n)=n^c[3]

Binary tree traversal

T(n) = 2 T\left(\frac{n}{2}\right) + O(1)

O(n)

Apply Master theorem where f(n)=n^c[3]

Optimal Sorted Matrix Search

T(n) = 2 T\left(\frac{n}{2}\right) + O(\log(n))

O(n)

Apply Akra-Bazzi theorem for p=1 and g(u)=\log(u) to get \Theta(2n - \log(n))

Merge Sort

T(n) = 2 T\left(\frac{n}{2}\right) + O(n)

O(n\log(n))

[edit]Notes

  1. ^ Duke University, "Big-Oh for Recursive Functions: Recurrence Relations", http://www.cs.duke.edu/~ola/ap/recurrence.html
  2. ^ Massachusetts Institute of Technology (MIT), "Master Theorem: Practice Problems and Solutions", http://www.csail.mit.edu/~thies/6.046-web/master.pdf
  3. ^ a b Dartmouth College, http://www.math.dartmouth.edu/archive/m19w03/public_html/Section5-2.pdf

[edit]References

Posted in 维基转载 | Tagged , | Leave a comment