Mathematical Economics – Cal

Problem 3 


Suppose the cost function of producing Q > 0 units of a commodity is \(C(Q) = aQ^2 + bQ + c\), where a, b, and c are all constants.

(a) Find the critical value of Q that minimizes the average cost function, AC(Q) = C(Q)/Q (this is called the minimum efficient scale in microeconomics).  

(b) Find the marginal cost function MC(Q) = dC(Q)/dQ, and show that MC(Q) = AC(Q) at the critical value of Q you found in part (a).

Problem 4 


Consider a person whose utility of money, x, is \(U(x) = \ln(1 + 0.5x)\). For simplicity, assume that he cannot have negative money, i.e. he can’t borrow, so that \(x \geq 0\). He is offered a bet with two possible payouts: $5 with a probability of 0.25, and $25 with a probability of 0.75.

(a) Is this a risk averse, risk neutral, or risk loving individual? How do you know? 

(b) If this were a fair game, what would the cost of the bet be? 

(c) How much should this bet cost so that this particular individual would be indifferent between making the bet or not? (Hint: remember that an individual is indifferent when the utilities of the two options are the same.) 

(d) Is the cost in part (c) higher or lower than if the bet was a fair game? Does this have to do anything with the person’s attitude towards risk that you mentioned in part (a)?

Problem 5   

Consider a plantation of pines that currently has a value of $5,000. The value grows at a continuous rate of \(4t^{1/4}\).

(a) Write the expression for the present value, PV, of the plantation in terms of t and the interest rate, r.

(b)  Write the expression for the optimal time, t*, to cut and sell the pine timber as a function of the interest rate, r.
      

(c)  Check the second-order condition for a maximum at the optimal value of t* . Does it hold, knowing that r > 0?
      

(d)  Assume that r = 0.04. What is the value of t* and of PV*?
      

Problem 6

Consider the function \(f(x) = 2/(3x+1)\).

(a)  We’re going to consider a 2nd-order Taylor expansion around the point x = 2. What is the 2nd-order polynomial that approximates f(x)? That is, find the expression for P2  in this case.
   

(b)  What is the general form of the Lagrange remainder for this case? That is, find the expression for R2  in this case. (Hint: this is a function of x and x*)     

(c) Now consider that x =4. What is the value of f(4)? 

(d)  What is the value of P2 that you found in part (a) when evaluated at x= 4?

(e)  What is, then, the value of the remainder R2, when x =4? (Hint: This is an actual number not a function of x* .)

(f) What is the value of x*  that will make the function and the full expansion have the same value at x= 4? 

Problem 7

Consider the function \(y = [(x-7)x]^2\).

(a) At what value(s) of x is it possible that we have a local maximum or minimum, or an inflection point? 

(b) Use the general test we have seen in section 11.1 to determine whether we have a maximum, minimum, or inflection point, for each of the critical values you found in part (a).

Chapter 2

Review of Univariate Calculus

Alfonso Sánchez-Peñalver
University of South Florida
alfonsos1@usf.edu

Class notes for Introduction to Mathematical Economics

Contents

    1 Functions
        1.1 Revenue, Cost, and Profit
    2 Change and Rate of Change, Differential and Derivative
        2.1 Differential and Derivative
        2.2 Derivative Rules
            2.2.1 Constant Rule
            2.2.2 Power Rule
            2.2.3 Logarithm Rule
            2.2.4 Exponential Rule
            2.2.5 Summation Rule
            2.2.6 Generalized Power Rule
            2.2.7 Product of Constant and Function Rule
            2.2.8 Difference Rule
            2.2.9 Product Rule
            2.2.10 Quotient Rule
            2.2.11 Chain Rule
    3 The Elasticity
    4 Continuous, Differentiable, and Continuously Differentiable Functions
    5 Marginal and Average Functions
        5.1 Marginal and Average Revenue
        5.2 Marginal and Average Cost
    6 Higher Order Derivatives
        6.1 The Second Derivative and Concavity and Convexity
        6.2 Attitude Towards Risk
    7 Maxima and Minima
        7.1 Conditions for a Local Maximum or Minimum
    8 Profit Maximization
    9 Optimal Timing
    10 Approximating Functions
        10.1 Maclaurin Series
        10.2 Taylor Series
        10.3 The Mean Value Theorem
        10.4 The Lagrange Form of the Remainder
    11 Conditions for a Local Maximum or Minimum
        11.1 General Test at a Critical Point
    12 Homework Problems
    References

    In this chapter we’re going to review the concepts of univariate (one-variable) calculus that you
    ought to know by now, and we will probably introduce some new ones that you will need.

    1 Functions

    We start by remembering what a function is. We say that \(y\) is a function of \(x\), and denote it as
    \(y = f(x)\), if \(f(\cdot)\) assigns a unique value of \(y\) to each \(x\) it takes.

    Example 1

    Consider \(y = 2x\). This function simply doubles the value of \(x\). The following are just some
    of the pairings that the function creates:

    x    y
    0    0
    1    2
    2    4
    3    6
    4    8

    Notice that in the previous example I said that the function creates pairings. In fact, a
    function is a special kind of what is called a relation. A relation creates pairs of values between
    \(x\) and \(y\), but it does not guarantee that each value of \(x\) is assigned only one value of \(y\),
    whereas a function does.

    Example 2

    To see the difference between a function and a relation, consider \(y = \pm\sqrt{x}\), that is,
    \(y^2 = x\). The following table shows some of the pairings formed by this relation:

    x     y
    81    9
    81    -9
    64    8
    64    -8
    49    7
    49    -7
    36    6
    36    -6

    Notice that \(y = \pm\sqrt{x}\) assigns two different values to each positive \(x\), and therefore
    is not a function. Since it does create paired values, it is a relation.

    The variable that is an input is called the function’s argument. In our examples it has been \(x\),
    which is why \(f(x)\) is usually read “f of x”, to indicate that \(y\) is a function of \(x\). As
    we have seen, a function with one argument produces pairs of values. These pairs of values are
    actual points in a 2-dimensional plane. It has 2 dimensions because the argument represents one
    dimension, and the resulting variable the other.

    [Figure 1: Points and Curve in a 2-Dimensional Space. Panel (a) Points, panel (b) Curve;
    horizontal axis x, vertical axis y.]

    To illustrate this better, consider again the function in example 1, and the pairs of values we
    have in that example. A 2-dimensional space is usually represented by a Cartesian diagram, where
    the argument is placed on the horizontal axis, and the resulting variable on the vertical
    axis. In figure 1a you can see the pairs of values in example 1 plotted in a Cartesian diagram.
    Notice that the value of each of the variables is the value in one of the two dimensions, and
    the pair of values denotes a point in the space. The points themselves don’t quite represent the
    function, but remember that a function is basically the set of all possible pairs it generates.

    Two important characteristics of a function are its domain and its range. A function’s domain is the
    set of values that its argument can take. The function’s argument is the variable the function
    takes and transforms into a different value. A function’s range is the set of values that the
    transformed values can take. Notice that the domain of the function in example 1 is the set of
    real numbers \(\mathbb{R}\), as is its range. This means that the pairs generated by this function
    lie right next to each other, with no gaps between them. We can thus represent the function, which
    is nothing but the whole set of pairs that it generates, by a curve. In this case the curve is a
    straight line, as shown in figure 1b for a few non-negative values of its domain.

    Example 3

    Consider the function

    \[ y = \frac{2 - x}{x^2 - 4}. \]

    The domain of a function is the set of all the values that \(x\) can take that return a determinate
    value of \(y\), and the range is the set of values that \(y\) can take. Here the domain is the set
    of real numbers \(\mathbb{R}\) except 2 and \(-2\), since both give \(x^2 = 4\), which makes the
    denominator 0 and the function indeterminate. For every \(x\) in the domain the function simplifies
    to \(y = -1/(x + 2)\), so the range is the set of real numbers \(\mathbb{R}\) except 0, which the
    function never reaches, and \(-1/4\), which would require the excluded value \(x = 2\).

    1.1 Revenue, Cost, and Profit

    In microeconomics we usually express revenue, cost, and, thus, profit, as functions of output.
    Consider

    \[ P = 50 - 10Q + \frac{Q^2}{2}. \tag{1} \]

    In equation (1) we have an example of what is called an inverse demand function, where we express
    the price as a function of output. It is what you saw in your principles of microeconomics course as
    the demand curve. We call it the inverse demand function because the demand function actually
    expresses quantity as a function of price.

    Now that we have the inverse demand function we can get the revenue function because, as you
    know, revenue is price times quantity. We have, then, that

    \[ R = P \times Q = \left(50 - 10Q + \frac{Q^2}{2}\right) Q = 50Q - 10Q^2 + \frac{Q^3}{2}. \tag{2} \]

    [Figure 2: The Inverse Demand and Revenue Functions. Panel (a) Inverse Demand: P against Q, with
    the point (4, 18) marked. Panel (b) Revenue: R against Q, with the point (4, 72) marked.]

    Before looking at the cost function, I want you to think of the relationship between the inverse
    demand function and the revenue function, because this can come in handy in your more advanced
    microeconomics courses. To do that look at figure 2, where figure 2a presents the inverse
    demand function in equation (1), and figure 2b the revenue function in equation (2). Notice that
    revenue is nothing but price times quantity. This means that at an output of 4, where the price
    is 18 (check it with equation (1)), the revenue is the area of the rectangle formed by the grey
    dashed lines in graph (a) and the two axes. That is because the horizontal distance, 4, is the
    quantity, and the vertical distance, 18, is the price. Clearly the revenue is, then, \(18 \times 4 = 72\).
    This is shown in figure 2b where at an output of 4, revenue is 72. One way to think about this
    is that the price is actually the average revenue, as long as each unit produced is sold at the
    same price. This may be true in monopolistic competition for the most part, but when we deal
    with markets with fewer firms where there could be price discrimination, the price will not be
    the average revenue. For each group of clients, though, the price they pay is the average revenue
    from that group.
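
As a quick numerical check of the link between equations (1) and (2), the following Python sketch evaluates both at an output of 4. The function names `inverse_demand` and `revenue` are illustrative choices, not something defined in these notes.

```python
def inverse_demand(q):
    """Price as a function of output, equation (1): P = 50 - 10Q + Q^2/2."""
    return 50 - 10 * q + q**2 / 2


def revenue(q):
    """Revenue as price times quantity, equation (2)."""
    return inverse_demand(q) * q


print(inverse_demand(4))  # 18.0, the height of the rectangle in figure 2a
print(revenue(4))         # 72.0, the area of that rectangle
```

Dividing `revenue(q)` by `q` returns the price again, which is the sense in which the price is the average revenue when every unit sells at the same price.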

    Consider now the following cost function

    \[ C = 10 + 50Q - 22Q^2 + \frac{7}{2}Q^3. \tag{3} \]

    You should remember from your principles courses that the part of cost that doesn’t change with
    output is called fixed cost, \(FC\), and the part of the cost that varies with output is variable cost,
    \(VC\). You should also remember that fixed cost only exists in the short run, when there are some
    fixed factors of production that we can’t change. In the long run all costs are variable. From all
    this we can deduce that the cost function in equation (3) is a short run cost function, and that

    \[ FC = 10 \tag{4} \]

    \[ VC = 50Q - 22Q^2 + \frac{7}{2}Q^3 \tag{5} \]

    Given the revenue function in equation (2) and the cost function in equation (3) we can derive
    the profit function. Remember that profit, which in microeconomics is usually referred to with
    the letter π, is nothing but revenue minus cost. Therefore

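    \[
    \begin{aligned}
    \pi = R - C &= 50Q - 10Q^2 + \frac{Q^3}{2} - \left(10 + 50Q - 22Q^2 + \frac{7}{2}Q^3\right) \\
                &= 50Q - 10Q^2 + \frac{Q^3}{2} - 10 - 50Q + 22Q^2 - \frac{7}{2}Q^3 \\
                &= -10 + 12Q^2 - 3Q^3
    \end{aligned}
    \tag{6}
    \]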

    To illustrate the relationship between revenue, cost, and profit, and what we are actually doing
    when we subtract one function from another, consider figure 3. Graph (a) presents the
    revenue and cost curves together, and graph (b) the resulting profit function. When we subtract
    one function from another with the same argument, what we’re actually doing is capturing the
    vertical distance between the function we are subtracting from and the function we subtract, at
    all the points in the domain of both functions.

    [Figure 3: Revenue, Cost, and Profit Functions. Panel (a) Revenue and Cost: R and C against Q,
    with the points (2, 64) and (2, 50) marked. Panel (b) Profit: π against Q, with the point (2, 14)
    marked.]

    Consider, then, what happens when \(Q = 2\). If you substitute this quantity in equations (2) and
    (3), you will find that the revenue when selling 2 units is 64, and the cost of producing 2 units
    is 50. This is reflected by the height of each respective curve from the horizontal axis in graph
    (a), where \(Q = 2\). Consequently, the profit is \(64 - 50 = 14\), the difference between the
    heights of the two curves, which is exactly the height of the profit function in graph (b) where
    \(Q = 2\). So profit increases when the revenue curve rises relative to the cost curve, decreases
    when the revenue curve falls relative to the cost curve, and remains unchanged when the heights of
    the two curves don’t change relative to each other (they can both increase or decrease at the same
    rate, or they can both stay at the same height).
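
The subtraction in equation (6) can be checked numerically. The sketch below codes equations (2), (3), and (6) directly and compares the vertical distance between the revenue and cost curves with the simplified profit function; the function names are illustrative.

```python
def revenue(q):
    """Equation (2): R = 50Q - 10Q^2 + Q^3/2."""
    return 50 * q - 10 * q**2 + q**3 / 2


def cost(q):
    """Equation (3): C = 10 + 50Q - 22Q^2 + (7/2)Q^3."""
    return 10 + 50 * q - 22 * q**2 + 7 * q**3 / 2


def profit(q):
    """Profit as the vertical distance between the two curves."""
    return revenue(q) - cost(q)


def profit_simplified(q):
    """Equation (6): pi = -10 + 12Q^2 - 3Q^3."""
    return -10 + 12 * q**2 - 3 * q**3


print(revenue(2), cost(2), profit(2))  # 64.0 50.0 14.0
```

The two profit functions agree at any output, not only at 2, because equation (6) is an identity rather than a statement about one point.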

    2 Change and Rate of Change, Differential and Derivative

    When a variable is a function of another, like \(y\) is a function of \(x\) for example,
    any change in \(y\) can really only be caused by a change in \(x\). We usually let the Greek symbol
    \(\Delta\) (capital delta) represent change, such that \(\Delta y\) represents the change in \(y\).
    If we have two pairs

    \[ y_1 = f(x_1), \qquad y_2 = f(x_2), \]

    where the subscripts 1 and 2 denote that there are two different values of \(x\) and \(y\), we
    have that

    \[ \Delta y = y_2 - y_1 = f(x_2) - f(x_1). \]

    Letting \(\Delta x = x_2 - x_1\), so that \(x_2 = x_1 + \Delta x\), we have that

    \[ \Delta y = f(x_1 + \Delta x) - f(x_1). \tag{7} \]

    Notice that \(\Delta y \neq f(\Delta x)\); rather, it is the difference between the two transformed
    values at each of the values of \(x\). Clearly \(\Delta y\) is a function of \(\Delta x\), but only
    because \(x_2 = x_1 + \Delta x\) and \(y_2 = f(x_2)\).

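    [Figure 4: Change and Rate of Change. The curve f(x) with the points (x1, y1) and (x2, y2), and
    the distances ∆x and ∆y marked; horizontal axis x, vertical axis y.]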

    Figure 4 illustrates this concept. Notice that \(\Delta x = x_2 - x_1\) is the distance between the
    two values on the horizontal axis. Similarly, \(\Delta y = y_2 - y_1 = f(x_2) - f(x_1)\). By
    increasing \(x\) from \(x_1\) to \(x_2\), we’re moving along \(f(x)\) from \(y_1 = f(x_1)\) to
    \(y_2 = f(x_2)\).

    Was the effect of x on y large or not? When calculating ∆y we don’t know how strong an effect
    x has on y, we only know how much y changed. The amount that y changed will depend on two
    things:

    1. how much \(x\) has changed, i.e. \(\Delta x\), and

    2. how strong an influence \(x\) has on \(y\) between \(x_1\) and \(x_2\).

    The second measure is what we capture with what is called the rate of change of \(y\) in terms of
    \(x\), which is defined as the change in \(y\) per unit change of \(x\):

    \[ \frac{\Delta y}{\Delta x} = \frac{f(x_2) - f(x_1)}{x_2 - x_1} = \frac{f(x_1 + \Delta x) - f(x_1)}{\Delta x}. \tag{8} \]

    If you remember that the slope of a line is nothing but the rise over the run, and you look at
    figure 4 one more time, you will see that the rate of change in \(y\) in terms of \(x\) when \(x\)
    changed from \(x_1\) to \(x_2\) is nothing but the slope of the line that joins the two points on
    \(f(x)\), i.e. \(f(x_1)\) and \(f(x_2)\) (the purple line). So the rate of change between two points
    on a function is the slope of the straight line that joins those two points. For the function in
    figure 4 we see that the line joining any two points is flatter for lower values of \(x\) and
    steeper for larger values of \(x\), so the rate of change increases with \(x\), and \(x\) will have
    a larger effect on \(y\), per unit of the change in \(x\), the larger the value of \(x\) is.

    2.1 Differential and Derivative

    The differential of \(y\) is the change in \(y\) when the change in \(x\) is infinitesimal
    (extremely small), and the derivative of \(y\) with respect to \(x\) is the rate of change in \(y\)
    per unit of change in \(x\) when the change in \(x\) is infinitesimal. How do we define them? We
    make use of limits. That way, and remembering equation (7), the differential of \(y\), \(dy\), is

    \[ dy \equiv \lim_{\Delta x \to 0} \Delta y = \lim_{\Delta x \to 0} \left[ f(x_1 + \Delta x) - f(x_1) \right]. \tag{9} \]

    When you look at equation (9) you immediately think that the limit has to be 0. Clearly the
    smaller the change in x, the smaller the change in y. However, you should think of this as ∆x
    approaches 0, but it never reaches it. That means that there is a very small (infinitesimal) change
    in x, which causes the infinitesimal change in y. We will come back to this once we cover the
    derivative.

    Like we did for the differential, we define the derivative of \(y\) with respect to \(x\),
    \(dy/dx\), taking the limit of the rate of change in equation (8):

    \[ \frac{dy}{dx} \equiv \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x_1 + \Delta x) - f(x_1)}{\Delta x}. \tag{10} \]

    When a variable is a function of just one variable, we can use any of the three following expressions
    to refer to the derivative:

    \[ \frac{dy}{dx} \equiv f'(x) \equiv y'. \]

    To better understand what the derivative represents, consider figure 5, where we see what happens
    to the rate of change as \(\Delta x \to 0\). Notice that as the increase in \(x\) shrinks towards
    zero, the second point on the curve comes closer and closer to the point \((x_1, y_1)\). As that
    happens, the slope of the line joining the two points, i.e. the rate of change, decreases. In the
    limit, the rate of change becomes the slope of the line that is tangent to \(f(x)\) at
    \((x_1, y_1)\). Notice that the slope of the curve at \((x_1, y_1)\) is equal to the slope of the
    straight line that is tangent at that point. This means that the derivative of a function evaluated
    at a given value of \(x\), e.g. \(f'(x_1)\), is the slope of the line that is tangent to \(f(x)\)
    at the point \((x_1, f(x_1))\) and, consequently, the slope of \(f(x)\) at \((x_1, f(x_1))\). In
    fact, at any point on the curve \(f(x)\) the slope at that point will be the value of the
    derivative, \(f'(x)\), evaluated at the value of \(x\) at that point, which is why we say that the
    derivative of a function measures the slope of the function at its different points.
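
The limit in equation (10) can be watched numerically. The following sketch computes the rate of change in equation (8) for ever smaller increases in x. The function \(f(x) = x^2\) and the point \(x_1 = 1\) are assumptions chosen for illustration; they are not the curve drawn in figure 5.

```python
def secant_slope(f, x1, dx):
    """Rate of change of f between x1 and x1 + dx, as in equation (8)."""
    return (f(x1 + dx) - f(x1)) / dx


def f(x):
    # Illustrative convex function (an assumption, not the curve in figure 5).
    return x**2


# As dx shrinks, the secant slope falls towards the tangent slope at x1 = 1.
for dx in (1.0, 0.1, 0.01, 0.001):
    print(dx, secant_slope(f, 1.0, dx))
```

For this function the secant slope equals \(2 + \Delta x\), so the printed values fall towards 2, the slope of the tangent line at \(x_1 = 1\).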

    [Figure 5: The Derivative and the Slope. The curve f(x) with the point (x1, y1) marked;
    horizontal axis x, vertical axis y.]

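    Getting back to the differential of a function, realizing that
    \(dx = \lim_{\Delta x \to 0} \Delta x\), and remembering that the limit of a product equals the
    product of the two limits,

    \[
    \begin{aligned}
    dy &= \lim_{\Delta x \to 0} \left[ f(x_1 + \Delta x) - f(x_1) \right] \\
       &= \lim_{\Delta x \to 0} \frac{f(x_1 + \Delta x) - f(x_1)}{\Delta x} \, \Delta x \\
       &= \lim_{\Delta x \to 0} \frac{f(x_1 + \Delta x) - f(x_1)}{\Delta x} \cdot \lim_{\Delta x \to 0} \Delta x,
    \end{aligned}
    \]

    which means that

    \[ dy = f'(x)\,dx. \tag{11} \]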

    Looking at equation (11) you may be thinking that this doesn’t tell you anything new, since
    \(f'(x) = dy/dx\), and if you multiply \(f'(x)\) by \(dx\) you should, then, get \(dy\), and in
    essence you’re right. What equation (11) is telling us, however, is that for very small
    (infinitesimal) changes in \(x\), i.e. \(dx\), the change in \(y\) is given by moving along the line
    that is tangent to \(f(x)\) at that point, since the rate of change (slope) on that straight line is
    given by \(f'(x)\). So for a small range of values around \(x_1\), for example, we can get the
    change in \(y\) by using the slope at \((x_1, f(x_1))\), which is given by \(f'(x_1)\), and
    multiplying it by the small change in \(x\), \(dx\). For larger changes in \(x\), so that
    \(\Delta x\) doesn’t approach 0, we can use the derivative to calculate an approximate change in
    \(y\), but it will be an approximation, not the actual value of \(\Delta y\). How good an
    approximation it is will depend on the curvature of \(f(x)\) at the point where we’re evaluating
    the derivative, and on the size of \(\Delta x\). Looking once more at figure 5, consider what
    happens as we move away from \(x_1\) to the right. If we use the slope at \((x_1, y_1)\) to
    approximate the rate of change, we will be moving along the straight line that is tangent at
    \((x_1, y_1)\). For small changes to the right the curve and the line stay very close to each other,
    but as \(\Delta x\) increases, and we move further to the right, the vertical distance between
    \(f(x)\) and the tangent line increases, thus making the approximation using the derivative a
    poorer one.
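
How quickly the tangent-line approximation deteriorates can be seen in a small sketch. It again assumes \(f(x) = x^2\) for illustration, with \(f'(x) = 2x\) taken from the power rule of section 2.2.2, and compares the actual change in y with the change predicted by equation (11).

```python
def f(x):
    # Illustrative function (an assumption, not the curve in figure 5).
    return x**2


def f_prime(x):
    # Derivative of x^2, from the power rule in section 2.2.2.
    return 2 * x


def approximation_error(x1, dx):
    """Actual change in y minus the tangent-line change f'(x1) * dx."""
    actual_change = f(x1 + dx) - f(x1)
    tangent_change = f_prime(x1) * dx
    return actual_change - tangent_change


# The gap between the curve and the tangent line grows with the size of dx.
for dx in (0.01, 0.1, 1.0):
    print(dx, approximation_error(1.0, dx))
```

For this particular function the error equals \((\Delta x)^2\), so it is negligible for a change of 0.01 but as large as 1 for a change of 1.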

    2.2 Derivative Rules

    Now that we have explored the concept of a derivative and a differential, we look at some rules
    that will allow us to get the derivative functions of many different functional forms.

    2.2.1 Derivative of a Constant

    If \(y \equiv f(x) = a\), where \(a\) is a constant (fixed number), then

    \[ y' \equiv f'(x) = \frac{da}{dx} = 0. \tag{12} \]

    Since \(a\) is a constant, as \(x\) changes \(a\) does not, so \(da = 0\) for any change in \(x\).
    The derivative is, consequently, 0.

    2.2.2 Derivative of a Power-Function

    If \(y \equiv f(x) = x^a\), where \(a\) is a constant (fixed number), then

    \[ y' = \frac{dx^a}{dx} = a x^{a-1}. \tag{13} \]

    The rule, then, is telling us that to get the derivative, we bring the exponent down, so we
    premultiply the variable by the exponent, and then subtract 1 from the exponent.

    Example 4

    Let \(y = x\), then

    \[ y' = \frac{dx}{dx} = 1 \cdot x^{1-1} = 1. \]

    This should be expected: since \(y = x\), any change in \(x\) equals the change in \(y\) and, thus,
    the rate of change is always 1.

    Let \(y = x^3\), then

    \[ y' = \frac{dx^3}{dx} = 3x^{3-1} = 3x^2. \]

    Let \(y = \frac{1}{x^3}\). Notice that this is the same as \(y = x^{-3}\). Therefore

    \[ y' = \frac{dx^{-3}}{dx} = -3x^{-3-1} = -3x^{-4} = -\frac{3}{x^4}. \]

    Let \(y = \sqrt{x}\). Notice that this is the same as \(y = x^{1/2}\). Therefore

    \[ y' = \frac{dx^{1/2}}{dx} = \frac{1}{2}x^{1/2-1} = \frac{x^{-1/2}}{2} = \frac{1}{2x^{1/2}} = \frac{1}{2\sqrt{x}}. \]
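
The results in this example can be checked against a numerical derivative. The sketch below uses a central difference, which is a technique swapped in here for the check and not part of these notes, and compares it with the power-rule answers at \(x = 2\). The helper `numerical_derivative` and the step size `h` are illustrative choices.

```python
import math


def numerical_derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x); h is an illustrative step size."""
    return (f(x + h) - f(x - h)) / (2 * h)


# (label, function, derivative from the power rule in equation (13))
cases = [
    ("x^3", lambda x: x**3, lambda x: 3 * x**2),
    ("1/x^3", lambda x: x**-3, lambda x: -3 / x**4),
    ("sqrt(x)", math.sqrt, lambda x: 1 / (2 * math.sqrt(x))),
]

for label, func, power_rule in cases:
    print(label, numerical_derivative(func, 2.0), power_rule(2.0))
```

Each pair of printed values should agree to several decimal places; the small remaining gap comes from the finite step size, not from the rule.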

    2.2.3 Derivative of a Logarithmic Function

    Let \(y = \log_a x\), where \(a\) is a constant and is the base of the logarithm. Then

    \[ y' = \frac{d \log_a x}{dx} = \frac{1}{x \ln a}, \tag{14} \]

    where \(\ln a\) is the natural logarithm of \(a\), the base of the original logarithm. Remember
    that the base for the natural logarithm is the number \(e = 2.71828\ldots\), which means that if
    \(y = \ln x\), then

    \[ y' = \frac{d \ln x}{dx} = \frac{1}{x \ln e} = \frac{1}{x}. \tag{15} \]

    2.2.4 Derivative of an Exponential Function

    Let \(y = a^x\), where \(a\) is a constant and is the base of the exponential function. Then

    \[ y' = \frac{da^x}{dx} = a^x \ln a. \tag{16} \]

    Again, since the base for the natural logarithm is the number \(e\), we have that when \(y = e^x\)

    \[ y' = \frac{de^x}{dx} = e^x \ln e = e^x. \tag{17} \]
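
A short numerical sketch of equations (14) and (16) follows. The base \(a = 10\), the point \(x = 2\), and the central-difference helper are all assumptions made for the illustration.

```python
import math


def numerical_derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x); h is an illustrative step size."""
    return (f(x + h) - f(x - h)) / (2 * h)


a, x = 10.0, 2.0  # illustrative base and evaluation point

log_rule = 1 / (x * math.log(a))  # equation (14)
exp_rule = a**x * math.log(a)     # equation (16)

print(numerical_derivative(lambda t: math.log(t, a), x), log_rule)
print(numerical_derivative(lambda t: a**t, x), exp_rule)
```

Setting `a = math.e` makes `math.log(a)` equal to 1, which reproduces the special cases in equations (15) and (17).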

    2.2.5 Derivative of the Sum of Two Functions of the Same Variable

    Let \(f(x)\) and \(g(x)\) be any two functions of the same variable \(x\), and let
    \(y = f(x) + g(x)\). Then

    \[ y' = \frac{d\left[f(x) + g(x)\right]}{dx} = \frac{df(x)}{dx} + \frac{dg(x)}{dx} = f'(x) + g'(x). \tag{18} \]

    Notice that the change in \(y\) will be given by the change that \(x\) causes in \(f(x)\) plus the
    change \(x\) causes in \(g(x)\), because \(y\) is the sum of both functions. Therefore, the
    derivative of the sum of two functions equals the sum of the derivatives of each of the functions.

    Example 5

    Let \(y = x^2 + x^5\), then

    \[ y' = \frac{d(x^2 + x^5)}{dx} = \frac{dx^2}{dx} + \frac{dx^5}{dx} = 2x + 5x^4. \]

    Let \(y = 3x^2\). Notice that this can be expressed as \(y = x^2 + x^2 + x^2\), so

    \[ y' = \frac{d\,3x^2}{dx} = \frac{d(x^2 + x^2 + x^2)}{dx} = \frac{dx^2}{dx} + \frac{dx^2}{dx} + \frac{dx^2}{dx} = 3\frac{dx^2}{dx} = 3 \times 2x = 6x. \]

    2.2.6 Derivative of a Power Generalized

    The second case in example 5 allows us to write a more generalized version of the power rule we
    saw in section 2.2.2. Let \(y = ax^b\), where \(a\) and \(b\) are both constants (fixed numbers).
    Then

    \[ y' = \frac{d\,ax^b}{dx} = abx^{b-1}. \tag{19} \]

    This works like the power rule in equation (13) in that you bring the exponent, \(b\), to the front
    of the variable and still subtract 1 from the exponent. Since the constant \(a\) was already
    multiplying the variable, you now have the constant \(ab\) (\(a\) times \(b\)) in front of the
    variable.

    Example 6

    Let \(y = 5x^2\), then

    \[ y' = \frac{d\,5x^2}{dx} = 2 \times 5x^{2-1} = 10x. \]

    Let \(y = -3x^4\), then

    \[ y' = \frac{d(-3x^4)}{dx} = -3 \times 4x^{4-1} = -12x^3. \]

    2.2.7 Derivative of the Product of a Constant and a Function of a Variable

    Let \(y = af(x)\), where \(a\) is a constant. Then

    \[ y' = \frac{d\left[af(x)\right]}{dx} = a\frac{df(x)}{dx} = af'(x). \tag{20} \]

    This is another corollary of the sum rule in section 2.2.5, and the generalized power rule in
    section 2.2.6 is just a particular case of this rule. Once more, the result is a very logical one.
    Notice that \(y\) is \(a\) times \(f(x)\), so any change in \(x\) first causes a change in
    \(f(x)\), and that change is then multiplied by \(a\). Therefore, per unit of change in \(x\), the
    rate of change will be \(af'(x)\).

    Example 7

    Let \(y = 3 \cdot 2^x\), then

    \[ y' = \frac{d(3 \cdot 2^x)}{dx} = 3 \cdot 2^x \cdot \ln 2 = 3 \ln 2 \cdot 2^x = \ln 8 \cdot 2^x. \]

    Let \(y = 2 \ln x\), then

    \[ y' = \frac{d(2 \ln x)}{dx} = \frac{2}{x}. \]

    2.2.8 Derivative of the Difference of Two Functions of the Same Variable

    Let \(f(x)\) and \(g(x)\) be any two functions of the same variable \(x\), and let
    \(y = f(x) - g(x)\). Then

    \[ y' = \frac{d\left[f(x) - g(x)\right]}{dx} = \frac{df(x)}{dx} - \frac{dg(x)}{dx} = f'(x) - g'(x). \tag{21} \]

    We see, then, that the derivative of a difference between two functions equals the difference
    between the derivatives of the respective functions. This follows from the derivative of the sum
    rule, and the fact that subtracting a function is the same as adding the function multiplied by
    the constant \(-1\).

    Example 8

    Let \(y = 2x^3 - 3 \ln x\), then

    \[ y' = \frac{d(2x^3 - 3 \ln x)}{dx} = \frac{d(2x^3)}{dx} - \frac{d(3 \ln x)}{dx} = 6x^2 - \frac{3}{x}. \]

    2.2.9 Derivative of the Product of Two Functions of the Same Variable

    Let \(f(x)\) and \(g(x)\) be any two functions of the same variable \(x\), and let
    \(y = f(x) \cdot g(x)\). Then

    \[ y' = \frac{d\left[f(x) \cdot g(x)\right]}{dx} = f'(x) \cdot g(x) + f(x) \cdot g'(x). \tag{22} \]

    In other words, the derivative is the sum of two products: the derivative of each function times
    the other function left undifferentiated.

    Example 9

    Let \(y = (3x^2 + 2)e^x\), then

    \[ y' = \frac{d\left[(3x^2 + 2)e^x\right]}{dx} = (6x + 0)e^x + (3x^2 + 2)e^x = (3x^2 + 6x + 2)e^x. \]

    Let \(y = (3x^2 + 2)(5x^3 - 2x)\), then

    \[ y' = \frac{d\left[(3x^2 + 2)(5x^3 - 2x)\right]}{dx} = 6x(5x^3 - 2x) + (3x^2 + 2)(15x^2 - 2). \]
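
One way to check the second result is to multiply the original product out first, which gives \(y = 15x^5 + 4x^3 - 4x\), and differentiate term by term to get \(y' = 75x^4 + 12x^2 - 4\). The sketch below, with illustrative function names, confirms that this agrees with the product-rule expression at several integer values of x.

```python
def product_rule_form(x):
    """Derivative as written with the product rule, equation (22)."""
    return 6 * x * (5 * x**3 - 2 * x) + (3 * x**2 + 2) * (15 * x**2 - 2)


def expanded_form(x):
    """Derivative of the multiplied-out polynomial 15x^5 + 4x^3 - 4x."""
    return 75 * x**4 + 12 * x**2 - 4


for x in range(-3, 4):
    print(x, product_rule_form(x), expanded_form(x))
```

With integer inputs the arithmetic is exact, so the two columns match digit for digit.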


    2.2.10 Derivative of the Quotient of Two Functions of the Same Variable

    Let \(f(x)\) and \(g(x)\) be any two functions of the same variable \(x\), and let
    \(y = f(x)/g(x)\). Then

    \[ y' = \frac{d\,\dfrac{f(x)}{g(x)}}{dx} = \frac{f'(x) \cdot g(x) - f(x) \cdot g'(x)}{\left[g(x)\right]^2}. \tag{23} \]

    This tells us that the derivative of a quotient is itself a quotient. Its numerator is the
    derivative of the numerator function times the original denominator function, minus the original
    numerator function times the derivative of the denominator function. Its denominator is the
    original denominator function squared.

    Example 10

    Let \(y = \dfrac{3x^2 + 2}{e^x}\), then

    \[ y' = \frac{d\left(\dfrac{3x^2 + 2}{e^x}\right)}{dx} = \frac{6xe^x - (3x^2 + 2)e^x}{(e^x)^2} = \frac{e^x(6x - 3x^2 - 2)}{e^{2x}} = \frac{6x - 3x^2 - 2}{e^x}. \]

    Now, let \(y = \dfrac{x}{2}\). We know that this is the same as \(y = 0.5x\), so from equation (19)
    we know that \(y' = 0.5\). We use the quotient rule to get the same result:

    \[ y' = \frac{d\left(\dfrac{x}{2}\right)}{dx} = \frac{1 \cdot 2 - x \cdot 0}{2^2} = \frac{2}{4} = 0.5. \]
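
The quotient rule in equation (23) can also be written as a small reusable function. In the sketch below `quotient_rule` is a hypothetical helper that takes the two functions and their derivatives, and it is applied to both cases in this example.

```python
import math


def quotient_rule(f, f_prime, g, g_prime, x):
    """Derivative of f(x)/g(x) at x, following equation (23)."""
    return (f_prime(x) * g(x) - f(x) * g_prime(x)) / g(x) ** 2


# First case: y = (3x^2 + 2) / e^x, whose derivative is (6x - 3x^2 - 2) / e^x.
first = quotient_rule(
    lambda x: 3 * x**2 + 2, lambda x: 6 * x, math.exp, math.exp, 1.0
)
print(first, (6 * 1.0 - 3 * 1.0**2 - 2) / math.exp(1.0))

# Second case: y = x / 2, whose derivative is 0.5 at every x.
second = quotient_rule(lambda x: x, lambda x: 1, lambda x: 2, lambda x: 0, 7.0)
print(second)
```

Passing the derivatives in explicitly keeps the helper a direct transcription of the rule, at the cost of having to differentiate each piece by hand.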

    2.2.11 Derivative of a Function of a Function of a Variable

    This rule is usually called the chain rule because you take the derivatives in a chain from outside
    to inside. Let \(y = f(x)\) and \(z = g(y)\). Notice that this really means that \(z = h(x)\),
    where \(h(x) = g\left[f(x)\right]\) is the composed functional form. Then

    \[ z' = \frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx} = g'(y) \cdot f'(x) = g'\left[f(x)\right] \cdot f'(x). \tag{24} \]

    Notice, then, that the rule is telling us that we first take the derivative of the outer function,
    \(g\left[f(x)\right]\), with respect to the whole inside function, and then multiply that by the
    derivative of the inside function with respect to the variable we want. Let’s see this in an
    example, which will clarify this rule.

    Example 11

    Let $y = (3x^2 + 3)^2$. Notice that we can let $z = 3x^2 + 3$, so that $y = z^2$. According to the
    chain rule

    $$y' = \frac{d(3x^2 + 3)^2}{dx} = \frac{dy}{dz} \cdot \frac{dz}{dx} = 2z \cdot 6x = 2(3x^2 + 3)6x = 36x^3 + 36x.$$

    We can check this answer since we know that $(3x^2 + 3)^2 = 9x^4 + 9 + 18x^2$. Using the general
    power rule and the addition rule, we have

    $$y' = \frac{d(9x^4 + 9 + 18x^2)}{dx} = 36x^3 + 36x.$$

    Sometimes we cannot expand the expression first and so need the chain rule. For example, let
    $y = \ln(3x^2 + 2x)$. Here we can let $z = 3x^2 + 2x$ so that $y = \ln z$, and

    $$y' = \frac{d \ln(3x^2 + 2x)}{dx} = \frac{dy}{dz} \cdot \frac{dz}{dx} = \frac{1}{z} \cdot (6x + 2) = \frac{6x + 2}{3x^2 + 2x}.$$
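The chain-rule result in example 11 can be checked numerically. The sketch below is only an illustration: the helper names (`numeric_derivative`, `chain_rule_derivative`) are hypothetical and do not come from the notes, and the central difference is an approximation, so it agrees with the formula only up to a small error.

```python
def numeric_derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def y(x):
    return (3 * x**2 + 3) ** 2


def chain_rule_derivative(x):
    z = 3 * x**2 + 3      # inside function
    dy_dz = 2 * z         # derivative of the outside function z**2
    dz_dx = 6 * x         # derivative of the inside function
    return dy_dz * dz_dx


for x in (0.5, 1.0, 2.0):
    # chain rule, expanded form 36x^3 + 36x, and the numerical estimate
    print(x, chain_rule_derivative(x), 36 * x**3 + 36 * x, numeric_derivative(y, x))
```

The first two columns should coincide, and the third should be close to them.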

    3 The Elasticity

    Now that we have considered the concepts of changes, differentials, rates of change, and derivatives,
    we are ready to look at some applications in economics. A measure you should have been
    introduced to in your principles of economics courses is that of an elasticity. Let $y = f(x)$; the
    elasticity of this function is defined by

    $$\eta = \frac{\%\Delta y}{\%\Delta x}. \tag{25}$$

    A percentage change is defined as the difference between two values over the original value. That
    way

    $$\%\Delta y = \frac{y_2 - y_1}{y_1} = \frac{\Delta y}{y_1} = \frac{f(x_1 + \Delta x) - f(x_1)}{f(x_1)} = \frac{f(x_1 + \Delta x)}{f(x_1)} - 1 \tag{26}$$

    $$\%\Delta x = \frac{x_2 - x_1}{x_1} = \frac{\Delta x}{x_1} = \frac{x_2}{x_1} - 1. \tag{27}$$

    Example 12

    Let $y = 3x - 0.5x^2$. What is the elasticity between the points $x_1 = 2$ and $x_2 = 3$?

    We start with the percentage change in $x$. Using equation (27),

    $$\%\Delta x = \frac{3}{2} - 1 = 0.5,$$

    so 50%. Now, to get the percentage change in $y$ we have to get the two $y$ values. These are

    $$y_1 = 3 \cdot 2 - 0.5 \cdot 2^2 = 6 - 0.5 \cdot 4 = 6 - 2 = 4$$
    $$y_2 = 3 \cdot 3 - 0.5 \cdot 3^2 = 9 - 0.5 \cdot 9 = 9 - 4.5 = 4.5,$$

    so using equation (26)

    $$\%\Delta y = \frac{4.5}{4} - 1 = 0.125,$$

    so 12.5%.

    This means that

    $$\eta = \frac{\%\Delta y}{\%\Delta x} = \frac{0.125}{0.5} = 0.25.$$
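The arithmetic of example 12 can be reproduced in a few lines. This is a minimal sketch; the function name `arc_elasticity` is a label chosen here for the two-point elasticity of equation (25) and is not terminology from the notes.

```python
def arc_elasticity(f, x1, x2):
    """Elasticity between two points: equation (25) with (26) and (27)."""
    pct_dx = x2 / x1 - 1          # equation (27)
    pct_dy = f(x2) / f(x1) - 1    # equation (26)
    return pct_dy / pct_dx


def f(x):
    return 3 * x - 0.5 * x**2


print(arc_elasticity(f, 2, 3))    # 0.25, as in example 12
```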

    Using equations (26) and (27) we can express the elasticity in equation (25) as

    $$\eta = \frac{\Delta y / y_1}{\Delta x / x_1} = \frac{\Delta y}{\Delta x} \cdot \frac{x_1}{y_1}. \tag{28}$$

    Equation (28) is telling us that the elasticity between two points on a function is the rate of
    change between the two points times the quotient with the original $x$ point in the numerator and
    the original $y_1 = f(x_1)$ in the denominator. For infinitesimal (very small) changes in $x$, i.e. $dx$,
    we have that

    $$\eta = \frac{dy / y}{dx / x} = \frac{dy}{dx} \cdot \frac{x}{y} = \frac{dy}{dx} \cdot \frac{x}{f(x)}. \tag{29}$$

    Equation (29) expresses the elasticity of a function at a point. From equation (15) we know that
    $\frac{d \ln x}{dx} = \frac{1}{x}$, so that $d \ln x = \frac{dx}{x}$. So it must also be that
    $d \ln y = \frac{dy}{y}$. Using these two expressions with equation (29) we have that

    $$\eta = \frac{d \ln y}{dx}\, x = \frac{d \ln y}{d \ln x}. \tag{30}$$

    Equations (29) and (30) are the expressions we will use throughout the course to get elasticities.
    Both are equivalent, and you can use either of them. We now look at some examples.

    Example 13

    Let $y = 2x^3 - x^2 + 3$. Using equation (29), we have that

    $$\eta = (6x^2 - 2x)\frac{x}{y} = \frac{6x^3 - 2x^2}{2x^3 - x^2 + 3}.$$

    Now, if we take the natural log, $\ln y = \ln(2x^3 - x^2 + 3)$, the right-hand side is not written
    in terms of $\ln x$, so we can’t take the derivative with respect to $\ln x$ directly. We can,
    however, take the derivative with respect to $x$ and then multiply by $x$ to get the elasticity.
    So using the chain rule

    $$\eta = \frac{1}{2x^3 - x^2 + 3}(6x^2 - 2x)x = \frac{6x^3 - 2x^2}{2x^3 - x^2 + 3}.$$

    As you can see they both return the same expression.

    Example 14

    Let $y = \dfrac{3}{2x^2}$.

    Notice that in this case $\ln y = \ln 3 - \ln 2x^2 = \ln 3 - \ln 2 - \ln x^2 = \ln 3 - \ln 2 - 2\ln x$. We now
    have the natural log of $y$ in terms of the natural log of $x$, so we can differentiate the natural log of
    $y$ with respect to the natural log of $x$ to get the elasticity.

    $$\eta = \frac{d \ln y}{d \ln x} = 0 - 0 - 2 = -2.$$

    Notice that we’re taking the derivative with respect to $\ln x$. That is as if you let $z = \ln x$
    and took the derivative with respect to $z$. That is why the result is $-2$, since that is
    what is multiplying $\ln x$.

    If, instead, we wanted to use equation (29), notice that $y = \frac{3}{2}x^{-2}$, so

    $$\eta = -3x^{-3}\,\frac{x}{y} = -\frac{3}{x^3} \cdot \frac{x}{\frac{3}{2x^2}} = -\frac{3}{x^3} \cdot \frac{2x^3}{3} = -2.$$

    As you can see we get the same result.
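To see that equations (29) and (30) really do give the same number, one can approximate both numerically for the functions of examples 13 and 14. This is only a sketch: the helper names are hypothetical, the derivatives are central-difference approximations, and the evaluation point $x = 2$ is an arbitrary choice where $y > 0$ so that the logarithm is defined.

```python
import math


def elasticity_via_29(f, x, h=1e-6):
    """Equation (29): dy/dx * x / y, with dy/dx from a central difference."""
    dy_dx = (f(x + h) - f(x - h)) / (2 * h)
    return dy_dx * x / f(x)


def elasticity_via_30(f, x, h=1e-6):
    """Equation (30): d ln y / d ln x, taking a small step in ln x."""
    lx = math.log(x)
    up = math.log(f(math.exp(lx + h)))
    down = math.log(f(math.exp(lx - h)))
    return (up - down) / (2 * h)


def example_13(x):
    return 2 * x**3 - x**2 + 3


def example_14(x):
    return 3 / (2 * x**2)


x = 2.0
print(elasticity_via_29(example_13, x), elasticity_via_30(example_13, x))
print(elasticity_via_29(example_14, x), elasticity_via_30(example_14, x))
```

For example 13 both numbers should be close to $(6x^3 - 2x^2)/(2x^3 - x^2 + 3)$ evaluated at the chosen $x$, and for example 14 both should be close to $-2$ at any positive $x$.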

    4 Continuous, Differentiable, and Continuously Differentiable Functions

    In economics, when we do optimization, which we will cover throughout the course, we usually
    want to use continuously differentiable functions, because this allows for the models we build to
    work more smoothly. It is important, then, that we know exactly what this means. For that, we
    need to understand what a continuous function is, first, then when a function is differentiable,
    and then what a continuously differentiable function is and how each of these characteristics
    relates to the other. As usual, we assume that you’re already familiar with what a limit is, and
    use the concept here so that you have an exact definition.

    We start with the definition of a continuous function.

    A function $f(x)$ is continuous at a point $x_1$ in the domain of the function, so that
    $f(x_1)$ is defined and determinate, if and only if $\lim_{x \to x_1} f(x)$ exists, i.e. is unique and finite,
    and it equals $f(x_1)$.

    The definition tells us that the limit as we approach the point, $x_1$, from either side is the same,
    that it is $f(x_1)$, and that $f(x_1)$ is a finite and determinate value. This is extended to an interval.

    A function $f(x)$ is continuous in the interval $(a, b)$ if it is continuous at all the
    points in the interval.

    Notice, then, that this means that all the points in the interval have to be in the domain of the
    function, and that the limit as $x$ approaches each point in the interval from either side has to be
    the value of the function at that point. This means that a function is continuous in its domain
    if it’s continuous at all points of its domain. Notice that this doesn’t mean that the function is
    continuous in $\mathbb{R}$. A function can only be continuous in all of $\mathbb{R}$ if its domain is
    $\mathbb{R}$, but that is not sufficient, because it also needs to be continuous at each point in $\mathbb{R}$.

    Figure 6: A Discontinuous Function (axes: x and y; labels: f(x), x1, y1, y2)

    Notice that a function can be defined at a point, and thus the point be in its domain, but not be
    continuous at that point. An example is given in figure 6, where $f(x)$ is discontinuous at $x_1$. Notice
    that $x_1$ is in the domain of $f(x)$ since $f(x_1) = y_2$, because that is where the point is solid. The
    problem is that the limit at $x_1$ is not unique. As we approach $x_1$ from the left (lower values) the
    limit is $y_1$, and as we approach $x_1$ from the right (higher values), the limit is $y_2$. The limit, thus,
    does not exist because it’s not unique, and the function is not continuous at $x_1$.

    We now turn to the definition of differentiable, which is a pretty straightforward one.

    A function $f(x)$ is differentiable at a point $x_1$ in the domain of the function, so
    that $f(x_1)$ is defined and determinate, if and only if the derivative of the function at
    that point exists, i.e. $f'(x_1)$ is unique and determinate.

    Even though the definition may seem quite obvious, if we look at it more closely it will shed
    important light on the relationship with whether a function is continuous or not. For that,
    consider again the definition of derivative in equation (10), and instead of using $x_2 = x_1 + \Delta x$,
    let us use a general $x$ for the other point, so we can express the derivative at $x_1$ as

    $$\frac{dy}{dx} \equiv f'(x_1) = \lim_{x \to x_1} \frac{f(x) - f(x_1)}{x - x_1}, \tag{31}$$

    where the numerator and the denominator have the limits

    $$\lim_{x \to x_1} [f(x) - f(x_1)] = \lim_{x \to x_1} f(x) - f(x_1), \qquad \lim_{x \to x_1} (x - x_1) = \lim_{x \to x_1} x - x_1.$$

    Since both $f(x_1)$ and $x_1$ are constants, their respective limits are $f(x_1)$ and $x_1$, and since the
    limit of a sum is the sum of the limits, we see that the only two limits we really need to consider
    are $\lim_{x \to x_1} f(x)$ and $\lim_{x \to x_1} x$, but only the first one is related to the continuity of $f(x)$.
    The denominator goes to zero, so the quotient in equation (31) can only have a finite limit if the
    numerator goes to zero as well. We see that in order for the derivative to exist at $x_1$ the limit of
    $f(x)$ must exist at $x_1$ and equal $f(x_1)$. We know
    that when a function is continuous at $x_1$ the limit of the function exists and it’s equal to $f(x_1)$.
    Consider that a function is discontinuous at $x_1$, like the one in figure 6. It is clear that the slope
    is not defined at $x_1$, because at that point there is a jump from $y_1$ to $y_2$, and the rate of change
    to the left of the point, or to the right of the point, is not the same as at the point. Remember
    that a derivative is the limit of the rate of change, and for a limit to exist it must be unique. It’s
    clear that the limit of the rate of change at $x_1$ doesn’t exist, so the derivative doesn’t exist, even
    though the function is defined at $x_1$. This means that

    for a function $f(x)$ to be differentiable at a point $x_1$ in the domain of the function,
    it is necessary but not sufficient for the function to be continuous at $x_1$.

    That is, the function needs to be continuous to be differentiable, but it is not enough. So all
    functions that are differentiable at a point are continuous at that point, but not all functions that
    are continuous at a point are differentiable at that point. Functions with a kink (an abrupt change
    of direction or slope) at a point are the usual examples of functions that are continuous
    at that point but not differentiable. A prime example is the absolute value of $x$. I leave it to you
    to draw it and see why it is not differentiable at 0, although continuous.
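Besides drawing it, the kink of the absolute value function can be seen by comparing the rate of change just to the left and just to the right of a point. The sketch below uses a hypothetical helper, `one_sided_slopes`, with a small but finite step, so it only approximates the two one-sided limits.

```python
def one_sided_slopes(f, x1, h=1e-6):
    """Difference quotients approaching x1 from the left and from the right."""
    left = (f(x1) - f(x1 - h)) / h
    right = (f(x1 + h) - f(x1)) / h
    return left, right


left, right = one_sided_slopes(abs, 0.0)
print(left, right)                   # the two slopes disagree at the kink
print(one_sided_slopes(abs, 2.0))    # away from the kink they agree
```

At 0 the left slope is $-1$ and the right slope is $+1$, so the limit of the rate of change is not unique and the derivative does not exist there, even though the function is continuous.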

    Finally, we consider what a continuously differentiable function is.

    A function $f(x)$ is continuously differentiable if, and only if, the derivative of the
    function, $f'(x)$, is continuous in the domain of the primitive function $f(x)$.

    For this to happen, $f(x)$ must be continuous in its domain, and must have no jumps or kinks
    in its domain, so that it is differentiable in its domain. In addition, the derivative must be
    continuous in the domain of the primitive function. Now, if a point is not in the domain of the
    original function, that doesn’t stop the function from being continuously differentiable, since the
    rule only has to hold for the points in the domain. So if there is a value of $x$, $x_0$, for which $f(x_0)$
    is not defined, such that $x_0$ is not in the domain of the function, that doesn’t mean that $f(x)$ is
    not continuously differentiable.

    Example 15

    Let $y = \dfrac{2x^2}{x^2 - 1}$.

    [Graph of $y$ against $x$; the $x$ axis marks $-1$ and $1$.]

    The graph above represents the function. Notice that, clearly, the function is discontinuous
    at $x = -1$ and $x = 1$. Furthermore, the function is undefined at those two points,
    so those two points are not in the domain of the function. The domain of the function is
    $(-\infty, -1) \cup (-1, 1) \cup (1, +\infty)$. Those two points are the only ones where the function is
    not differentiable, basically because the function itself is discontinuous there. The function,
    however, is continuously differentiable. To see this, let’s get the derivative. Using the
    quotient rule:

    $$y' = \frac{4x(x^2 - 1) - 2x^2 \cdot 2x}{(x^2 - 1)^2} = \frac{4x^3 - 4x - 4x^3}{x^4 - 2x^2 + 1} = \frac{-4x}{x^4 - 2x^2 + 1}.$$

    [Graph of $y'$ against $x$; the $x$ axis marks $-1$ and $1$.]

    The graph now presents the derivative function. We see that the derivative is also discontinuous
    at $x = -1$ and $x = 1$. This is because $y$ is not continuous and, thus, not differentiable
    at those points. Since those two points are not in the domain of $y$, however, $y$ is continuously
    differentiable because the derivative, $y'$, is continuous at all the points in the domain of $y$.
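As a rough check on example 15, the quotient-rule derivative can be compared with a numerical estimate at a few points inside the domain, and the two excluded points can be seen to fail. The sample points and helper names below are arbitrary choices for this sketch, not part of the notes.

```python
def y(x):
    return 2 * x**2 / (x**2 - 1)


def y_prime(x):
    return -4 * x / (x**4 - 2 * x**2 + 1)


def numeric_derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    print(x, y_prime(x), numeric_derivative(y, x))

for excluded in (-1.0, 1.0):
    try:
        y(excluded)
    except ZeroDivisionError:
        print(excluded, "is not in the domain of y")
```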

    5 Relationship between a Marginal Function and an Average Function

    You should have already been introduced to some marginal functions and average functions in
    your principles of economics courses. In economics the marginal function is the derivative of the
    function. As we shall see marginal revenue is the derivative of revenue with respect to output, and
    marginal cost is the derivative of cost with respect to output. We now explore the relationship
    between the marginal (derivative) function and the average function.

    One thing that you may not have realized yet is that the value that a function yields at a certain
    point is its starting value plus the sum of all the increments accumulated up to that point. This is
    illustrated more easily with a straight
    line because it allows us to see this with increments of $\Delta x = 1$, but it works for every function
    with smaller increments, i.e. $dx$.[1] Consider, then,

    $$y = 2 + 0.5x, \tag{32}$$

    which means that $y' = 0.5$. To pick a start point, let $x = 0$ so $y(0) = 2$. At $x = 4$,
    $y(4) = 2 + 0.5 \cdot 4 = 4$. This is because the first unit of $x$ brought 0.5 units of $y$, and so did the second,
    third, and fourth units. So in total we added $4 \cdot 0.5 = 2$ units from $x = 0$ to $x = 4$. Since it’s
    a straight line, each additional unit brings the same amount, because the marginal function is
    constant. When we don’t have a straight line, remember equation (11)

    $$dy = f'(x)\,dx$$

    Each small increment in $y$ is the derivative times the small increment in $x$. So as we move up
    the curve in smaller and smaller increments, we keep adding the $f'(x)\,dx$ to the original point, to get
    to $y$.

    The average function is simply given by $y/x$, i.e. the amount of $y$ per unit of $x$, at any given
    point. Using equation (32) we have that

    $$\frac{y}{x} = \frac{2}{x} + 0.5. \tag{33}$$

    Consider the right-hand side. Does the average increase or decrease as we increase $x$? Notice that as
    $x$ increases the average decreases, since $2/x$ becomes smaller. Notice that the derivative of the
    average function is $-2/x^2$, which is negative. What is happening is that the value of the marginal
    function at any value of $x$ is always smaller than the average function’s value, so as we increase
    $x$ the average keeps decreasing. Let’s see this, and remember that the marginal function is
    always equal to 0.5. We start at $x = 1$ (since $2/0$ is undefined) and see that the average is
    $2/1 + 0.5 = 2.5$. Since we’re going to add 0.5 to $y$ as we move to $x = 2$, notice that we’re adding
    a value that is less than the average, so the average at $x = 2$ must be smaller. We check that,
    and $2/2 + 0.5 = 1.5$ confirms this. In fact we have that

    when the value of the marginal function is below that of the average function,
    as we increase $x$ the average function’s value decreases, and when the value of
    the marginal function is above that of the average function, as we increase $x$
    the average function’s value increases.
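The rule can be tabulated for the straight line of equation (32). This is a small sketch with hypothetical names (`MARGINAL`, `average`); it simply evaluates equation (33) at a few whole values of $x$.

```python
def y(x):
    return 2 + 0.5 * x           # equation (32)


MARGINAL = 0.5                   # y' is constant for this straight line


def average(x):
    return y(x) / x              # equation (33); undefined at x = 0


previous = None
for x in range(1, 6):
    avg = average(x)
    fell = previous is not None and avg < previous
    # x, average, "marginal below average?", "average fell since last x?"
    print(x, avg, MARGINAL < avg, fell)
    previous = avg
```

At every $x$ the marginal value 0.5 is below the average, and the average is lower than at the previous $x$.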

    5.1 Marginal and Average Revenue

    We now consider the marginal and average revenue functions using the total revenue function
    we saw in section 1.1, given by equation (2)

    $$R = 50Q - 10Q^2 + \frac{Q^3}{2}.$$

    [1] This is, in fact, the concept of an integral that you would see in more advanced calculus courses.

    The average revenue function is $R/Q$. Notice that this is going to equal the price function in
    equation (1), since $R = P \cdot Q$, so $AR = (P \cdot Q)/Q = P$. Therefore

    $$AR \equiv \frac{R}{Q} = \frac{50Q - 10Q^2 + \frac{Q^3}{2}}{Q} = 50 - 10Q + \frac{Q^2}{2}, \tag{34}$$

    which is exactly the same expression as in equation (1).

    The marginal revenue function is the derivative of the revenue function with respect to quantity,
    so

    $$MR \equiv \frac{dR}{dQ} = 50 - 20Q + \frac{3}{2}Q^2. \tag{35}$$

    Figure 7: Average and Marginal Revenue (horizontal axis: Q; vertical axis: AR, MR; curves: AR and MR)

    Figure 7 graphs equations (34) and (35) for a few values of $Q$ so that we can explore the
    relationship. We see that except at the origin, i.e. where $Q = 0$, the MR is always below the
    AR over the range graphed (from equations (34) and (35), $MR - AR = Q^2 - 10Q$, which is negative
    for $0 < Q < 10$), which makes the AR decrease throughout that range, as we have explored before. Remember that the
    AR curve is exactly the inverse demand that we saw in figure 2 (a). Another thing of interest is
    that if you check figure 2 (b), the revenue curve changes the sign of its slope when $Q$ is slightly
    greater than 3 (setting $MR = 0$ gives $Q = 10/3$), since the revenue curve changes from increasing
    to decreasing. You can see this
    also in figure 7, because that is where the MR changes from positive to negative. This is because
    the MR is the slope of the revenue function.
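The values behind figure 7 can be tabulated from equations (34) and (35). The sketch below also solves $MR = 0$ with the quadratic formula to locate the point where revenue stops increasing; the variable name `q_peak` is a label chosen here, and only the smaller of the two roots is computed because it is the one inside the range graphed.

```python
import math


def AR(q):
    return 50 - 10 * q + q**2 / 2        # equation (34)


def MR(q):
    return 50 - 20 * q + 1.5 * q**2      # equation (35)


# Smaller root of MR = 0, from the quadratic formula.
a, b, c = 1.5, -20.0, 50.0
q_peak = (-b - math.sqrt(b**2 - 4 * a * c)) / (2 * a)

for q in range(0, 8):
    print(q, AR(q), MR(q), MR(q) < AR(q))
print(q_peak, MR(q_peak))
```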

    5.2 Marginal and Average Cost

    We now turn our attention to the cost side, also using the cost function we saw in section 1.1,
    given by equation (3)

    $$C = 10 + 50Q - 22Q^2 + \frac{7}{2}Q^3.$$

    Remember that this led to equations (4) and (5)

    $$FC = 10$$
    $$VC = 50Q - 22Q^2 + \frac{7}{2}Q^3.$$

    We start by looking at the average cost, which is nothing but the cost function divided by output,
    so

    $$AC = \frac{10}{Q} + 50 - 22Q + \frac{7}{2}Q^2, \tag{36}$$

    which can be broken up between average fixed cost and average variable cost as

    $$AFC = \frac{10}{Q} \tag{37}$$
    $$AVC = 50 - 22Q + \frac{7}{2}Q^2. \tag{38}$$

    Let us now get the expression for the marginal cost, which is nothing but the derivative of the
    cost function with respect to output, so

    $$MC = 50 - 44Q + \frac{21}{2}Q^2. \tag{39}$$

    If you pay attention you should see that it doesn’t matter whether we take the derivative of the
    cost function or of the variable cost function to get the marginal cost because, by definition, the
    fixed cost is constant and doesn’t change with output. This means that the marginal cost only
    affects the variable cost and, through the variable cost, it affects the total cost.

    Figure 8: Average and Marginal Costs (horizontal axis: Q; vertical axis: AC, AFC, AVC, MC; curves: AC, AVC, AFC and MC)

    Figure 8 presents the different functions we have been considering. Notice that the AFC always
    decreases as $Q$ increases, so this will always push the AC down. This is usually the big source
    of economies of scale. If you remember, there are economies of scale when the AC decreases as
    output increases, and there are diseconomies of scale when the AC increases as output increases.
    Since in the AVC function the coefficient on $Q$ is negative and the one on $Q^2$ is positive and
    smaller in magnitude than the one on $Q$, the AVC will also decrease as $Q$ increases for small
    values of $Q$. As $Q$ increases enough, the AVC will start to increase. So for small values of $Q$
    we will see both the AFC and AVC decrease, and both give rise to economies of scale. As $Q$
    increases, the AFC will continue to decrease, but eventually the AVC starts increasing. This will
    cause the AC to decrease at first, and then start increasing when the increasing AVC overcomes
    the magnitude of the decreasing AFC. Finally, since $AC = AFC + AVC$, notice that the vertical
    distance between the AC and AVC curves is the AFC. As $Q$ increases, and the AFC becomes
    smaller, the AC comes closer and closer to the AVC, with the vertical distance between them
    going to zero.

    Let’s explore now the relationship between the MC and the average costs. Remember that the
    only average costs that are related to the MC are the AC and AVC, since the FC is not related
    to $Q$. In figure 8 we can see the relationship we have explored between a marginal function and
    an average function. Notice that both the AC and AVC decrease when the MC is below each
    respective average cost curve. As soon as the MC rises above each of the average cost curves,
    the respective average cost curve starts increasing. This means that the MC curve intersects the
    AC and AVC curves at the points where each respective average cost curve is at its minimum
    value. Since the marginal cost decreases first and increases after, both average cost curves are
    U-shaped, and the MC curve intersects both at their respective minimum.
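The claim that the MC curve crosses each average cost curve at its minimum can be checked numerically for equations (36), (38) and (39). The sketch below uses a crude grid search (the helper `argmin_on_grid` and its range and step are assumptions made for this illustration), so the minimizers are only located to about four decimal places and MC matches the average cost only approximately.

```python
def AC(q):
    return 10 / q + 50 - 22 * q + 3.5 * q**2     # equation (36)


def AVC(q):
    return 50 - 22 * q + 3.5 * q**2              # equation (38)


def MC(q):
    return 50 - 44 * q + 10.5 * q**2             # equation (39)


def argmin_on_grid(f, lo=0.5, hi=7.0, steps=65000):
    """Crude grid search for the minimizer of f on [lo, hi]."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(grid, key=f)


q_avc = argmin_on_grid(AVC)
q_ac = argmin_on_grid(AC)
print(q_avc, AVC(q_avc), MC(q_avc))   # MC is close to AVC at the AVC minimum
print(q_ac, AC(q_ac), MC(q_ac))       # MC is close to AC at the AC minimum
```

Setting the derivative of equation (38) to zero gives the AVC minimum exactly at $Q = 22/7$; the AC minimum lies a little to the right of it, because the falling AFC keeps pulling the AC down for a while after the AVC has started to rise.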

    Notice that the analysis we have done here has been by using some specific form for the revenue
    and cost functions. However, it is general enough that it allows you to explore the relationship
    between output, revenue and cost, and therefore profit. It also gives you an idea of where you
    will have economies of scale and why.

    6 Higher Order Derivatives

    Just as there is a derivative of a function, there is a derivative of the derivative function, or
    second derivative of the primitive function. Since we can keep differentiating these derivative
    functions, as long as the resulting function is differentiable, we have what we call the order of
    the derivative, which tells us whether it is the first, second, third, etc. order derivative, where the
    order refers to how many times we have differentiated relative to the primitive function, $f(x)$. This
    way we have

    $$f'(x) \equiv \frac{dy}{dx} \quad \text{first (order) derivative}$$
    $$f''(x) \equiv \frac{d^2 y}{dx^2} \quad \text{second (order) derivative}$$
    $$f'''(x) \equiv \frac{d^3 y}{dx^3} \quad \text{third (order) derivative}$$
    $$f^{(4)}(x) \equiv \frac{d^4 y}{dx^4} \quad \text{fourth (order) derivative}$$
    $$f^{(n)}(x) \equiv \frac{d^n y}{dx^n} \quad n\text{th (order) derivative}$$

    Example 16

    Let’s find all the possible derivatives for the following function

    $$y = f(x) = 4x^4 - x^3 + 17x^2 + 3x - 1$$
    $$f'(x) = 16x^3 - 3x^2 + 34x + 3$$
    $$f''(x) = 48x^2 - 6x + 34$$
    $$f'''(x) = 96x - 6$$
    $$f^{(4)}(x) = 96$$
    $$f^{(5)}(x) = 0$$

    A polynomial is an interesting example. First, since we can take the derivative of a constant, a
    polynomial of order $n$ can be differentiated $n + 1$ times before the result is zero (every derivative
    after that is zero as well). In our case we had a 4th-order polynomial,
    and we have up to a fifth derivative. What we should be careful with is the fifth derivative. We
    see that it’s equal to zero. That doesn’t mean that the derivative doesn’t exist, but rather
    that it exists and its value is zero. Remember that a derivative measures a rate
    of change in a function as we change $x$, per unit of $x$. In this case the rate of change we’re
    measuring is that of the fourth derivative. Since this one is always 96, no matter what the
    value of $x$ is, there is no change in the fourth derivative as we change $x$, so the derivative
    of the fourth derivative, i.e. the fifth derivative, equals 0 at all values of $x$.
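The successive derivatives of example 16 can be generated mechanically with the power rule applied to the list of coefficients. The representation (coefficients listed from the highest power down) and the function name `differentiate` are choices made for this sketch.

```python
def differentiate(coeffs):
    """Differentiate a polynomial given as [a_n, ..., a_1, a_0] (highest power first)."""
    n = len(coeffs) - 1
    derivative = [c * (n - i) for i, c in enumerate(coeffs[:-1])]
    return derivative or [0]     # the derivative of a constant is 0


poly = [4, -1, 17, 3, -1]        # 4x^4 - x^3 + 17x^2 + 3x - 1
order = 0
while True:
    print(order, poly)           # order of the derivative and its coefficients
    if poly == [0]:
        break
    poly = differentiate(poly)
    order += 1
```

The loop stops at order 5, where the coefficient list is `[0]`: the fifth derivative exists and is zero everywhere.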

    An example I like to give that usually helps in understanding what a first derivative and a second
    derivative represent is to consider that $y$ is the distance traveled by a person and it’s a function
    of time, $f(t)$. We should expect that $f'(t) > 0$ because the more time you spend traveling the
    more distance you will have covered, no matter what method of transportation you’re using.
    What does $f'(t)$ represent? Now that you’ve dealt with derivatives, you should quickly come
    up with the answer: it’s the instantaneous rate of change in the distance traveled per unit of
    time change. Yes, clearly that’s the definition, but this has a simple name in English: speed!
    Notice that it’s the meters per second, the miles per hour, or whatever units you’re measuring
    the speed in. What about the second derivative, $f''(t)$? Since we know that the first derivative
    is the speed, the second derivative is the instantaneous rate of change in the speed per unit
    of time change. Wait, this also has a word in English: acceleration! Let’s look at a simple
    example that will show this.

    Example 17

    John is traveling from New York to Boston by car. In his first hour driving he covers a
    distance of 50 miles, since he has to go through the heavy traffic one usually finds when
    leaving NYC.

    If we let $d = f(t)$, where $d$ measures distance in miles and $t$ measures time in hours, this
    information is telling us that $d_0 = f(0) = 0$ miles, and $d_1 = f(1) = 50$ miles. The rate of
    change between these two points is

    $$\frac{\Delta d}{\Delta t} = \frac{50 - 0}{1 - 0} = 50 \text{ miles per hour}.$$

    So the average speed during the first hour is 50 miles per hour. After two hours of driving he
    has covered 120 miles. Again, this tells us that $d_2 = f(2) = 120$ miles. The rate of change
    between the first hour and the second hour is

    $$\frac{\Delta d}{\Delta t} = \frac{120 - 50}{2 - 1} = 70 \text{ miles per hour}.$$

    We see that during the second hour John’s average speed is higher than in his first hour.
    How much did the speed change between the first and second hour?

    $$\frac{\Delta(\Delta d / \Delta t)}{\Delta t} = \frac{\Delta^2 d}{\Delta t^2} = \frac{70 - 50}{2 - 1} = 20 \text{ miles per hour squared}.$$

    Even though example 17 is not looking at derivatives, but rather at discrete rates of change, it
    helps illustrate the concept of the rate of change in the rate of change, or second derivative. The
    concepts of speed and acceleration are ones we grow up with from our experience in traveling,
    so they help us grasp what we’re doing each time we differentiate.
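The discrete rates of change in example 17 are just differences of consecutive values divided by differences in time, applied twice. A minimal sketch, with a hypothetical helper `rates_of_change`:

```python
times = [0, 1, 2]            # hours
distances = [0, 50, 120]     # miles, from example 17


def rates_of_change(values, points):
    """Discrete rate of change between consecutive points."""
    return [
        (values[i + 1] - values[i]) / (points[i + 1] - points[i])
        for i in range(len(values) - 1)
    ]


speeds = rates_of_change(distances, times)           # miles per hour
# Each average speed is assigned to the end of its hour, as in the example.
accelerations = rates_of_change(speeds, times[1:])   # miles per hour squared
print(speeds, accelerations)
```

This reproduces the average speeds of 50 and 70 miles per hour and the change of 20 miles per hour squared between them.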

    6.1 The Second Derivative and Concavity and Convexity

    The second derivative is very useful in setting the conditions for a certain characteristic of a
    function: whether it is convex or concave at a point. Let’s explore the concepts of convexity and
    concavity before we use the second derivative to set the conditions.

    A function, $f(x)$, is weakly convex in an interval $(a, b)$ in its domain, if and only
    if any linear combination of the function values for any two points $c \in (a, b)$ and
    $d \in (a, b)$, $f(c)$ and $f(d)$, is greater than or equal to the value of the function at
    the linear combination of the original values, $c$ and $d$, and it is strictly convex if every
    possible linear combination of $f(c)$ and $f(d)$ is always greater than the function’s
    value at the linear combination of the original values, $c$ and $d$.

    Similarly

    A function, $f(x)$, is weakly concave in an interval $(a, b)$ in its domain, if and only
    if any linear combination of the function values for any two points $c \in (a, b)$ and
    $d \in (a, b)$, $f(c)$ and $f(d)$, is less than or equal to the value of the function at
    the linear combination of the original values, $c$ and $d$, and it is strictly concave if every
    possible linear combination of $f(c)$ and $f(d)$ is always less than the function’s
    value at the linear combination of the original values, $c$ and $d$.

    Figure 9: Weakly Convex and Concave Functions (panel (a): points A, B, C, D, E; panel (b): points A, B, F, G, H; both panels plot y against x, with ticks from 1 to 6)

    To explore the meaning of these definitions, consider figure 9, where figure 9a presents a function
    that is weakly convex over the whole interval graphed, and figure 9b a function that is weakly concave
    over the whole interval graphed. In a graph, the possible points that can result from a linear combination of
    two points are the points on the straight line that joins those two points.

    Concentrate on figure 9a first. The function there is given by

    $$y = \begin{cases} 1 + 0.5x & \text{if } 0 \le x \le 3 \\ 3.25 - x + 0.25x^2 & \text{if } x > 3 \end{cases}$$

    Let’s look at points A and B. Point A is point $(2, 2)$, so that $f(2) = 2$, and point B is $(3, 2.5)$,
    so that $f(3) = 2.5$. Now, arithmetically a linear combination of any two values, $c$ and $d$, is given
    by

    $$e = \alpha c + (1 - \alpha)d, \quad 0 < \alpha < 1. \tag{40}$$

    This is telling us that the linear combination of two values is a weighted average of those two
    values. There are infinitely many linear combinations of the values $c$ and $d$. So consider that $\alpha = 0.25$.
    Using the $x$ values of points A and B we have that

    $$0.25 \times 2 + (1 - 0.25) \times 3 = 2.75.$$

    When $x = 2.75$, we have that $y = f(2.75) = 1 + 0.5 \times 2.75 = 2.375$. We also have that the linear
    combination of the function’s values at points A and B, where $\alpha = 0.25$, is

    $$0.25 \times f(2) + (1 - 0.25) \times f(3) = 0.25 \times 2 + 0.75 \times 2.5 = 2.375.$$

    We see that the function’s value at the linear combination of the $x$ values of the two points is
    the same as the linear combination of the function’s values at the original two $x$ values. This
    satisfies the condition for both weakly convex and weakly concave.

    Consider, now, points A and C in figure 9a. We know that point A is $(2, 2)$, and point C is
    $(5, 4.5)$, so that $f(5) = 4.5$. Using, again, $\alpha = 0.25$, we have that the linear combination
    of the $x$ values is

    $$0.25 \times 2 + (1 - 0.25) \times 5 = 4.25.$$

    Since $4.25 > 3$, we use the second branch of the function and have that $f(4.25) \approx 3.52$. The linear
    combination of the values of the function is given by

    $$0.25 \times f(2) + (1 - 0.25) \times f(5) = 0.25 \times 2 + 0.75 \times 4.5 = 3.875.$$

    We see that $3.875 > 3.52$, so this indicates that the function is convex. Looking at the graph we
    see that point D is the function’s value at the linear combination of the $x$ values, so $(4.25, 3.52)$,
    and point E is given by the linear combination of the function’s values at the original points, so
    $(4.25, 3.875)$. You can see that point E falls on the line joining points A and C, which follows
    what we said before that any linear combination of those two points would fall on the straight
    line joining them. Now, remember that for the function to be weakly convex in an interval, the
    value of any linear combination of the function’s values at any two points in the interval must be
    greater than or equal to the value the function takes at the linear combination of the original $x$ values. We see that
    any linear combination of any two points where $x \le 3$ would fall on top of the function, because
    it is a straight line for those values. That was the case with points A and B. As soon as any
    one of the points is at $x > 3$, the linear combination of the values of the function will always be
    greater than the value of the function at the linear combination of the original $x$ values.

    Consider, now, figure 9b. The function there is given by

    $$y = \begin{cases} 1 + 0.5x & \text{if } 0 \le x \le 3 \\ -1.25 + 2x - 0.25x^2 & \text{if } x > 3. \end{cases}$$

    We see, then, that for $x \le 3$, the function is identical to that of figure 9a. For $x > 3$, however, the
    function has changed. Clearly any linear combination of points A and B still lies on the function,
    like we saw in the previous case. As we said there, this satisfies both weak convexity and weak
    concavity. Now, consider that we do a linear combination of points A and F. Point F is given
    by $(5, 2.5)$, so $f(5) = 2.5$ in this case. Using $\alpha = 0.25$, we have that the linear combination of
    the original $x$ values is still

    $$0.25 \times 2 + (1 - 0.25) \times 5 = 4.25,$$

    so, now, $f(4.25) \approx 2.73$. The linear combination of the function’s values at the original points is

    $$0.25 \times 2 + (1 - 0.25) \times 2.5 = 2.375.$$

    Since $2.375 < 2.73$ we have that the function is concave. In the graph point G is the point
    given by the linear combination of the original $x$ values and the value of the function there.
    Point H, on the other hand, is given by the linear combination of the original $x$ values, and the
    linear combination of the function’s values at those original $x$ values. You can see that the linear
    combination falls on the line joining points A and F, as expected, and that point G is above
    point H. The function is, thus, concave. To be weakly concave in the interval, the function’s
    value at the linear combination of any two $x$ values in the interval has to be greater than or equal to
    the linear combination of the function’s values at the chosen $x$ values.
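The comparisons made for figures 9a and 9b can be reproduced directly from the two piecewise functions and equation (40). The function names `f_a`, `f_b` and `compare` are labels chosen for this sketch.

```python
def f_a(x):
    """Function of figure 9a (weakly convex)."""
    return 1 + 0.5 * x if x <= 3 else 3.25 - x + 0.25 * x**2


def f_b(x):
    """Function of figure 9b (weakly concave)."""
    return 1 + 0.5 * x if x <= 3 else -1.25 + 2 * x - 0.25 * x**2


def compare(f, c, d, alpha):
    """Return (f at the combined x, combination of the f values), using equation (40)."""
    e = alpha * c + (1 - alpha) * d
    return f(e), alpha * f(c) + (1 - alpha) * f(d)


print(compare(f_a, 2, 3, 0.25))   # points A and B: the two values coincide
print(compare(f_a, 2, 5, 0.25))   # heights of points D and E
print(compare(f_b, 2, 5, 0.25))   # heights of points G and H
```

In the second line the combination of the function’s values is the larger number (convex), and in the third it is the smaller one (concave).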

    Let’s consider how we can determine if a function is convex or concave at a point. Remember
    that when we consider something at a point, what we really look at is what happens at small
    increments around the point. So you can think of the linear combination between two points
    infinitesimally close to the point. We’ve seen that both weakly convex and weakly concave share
    that the linear combination of the values of the function can be equal to the value of the function
    at the linear combination of the chosen x values. Notice that this happens when the function is
    a straight line between the two points that we choose to linearly combine. In those cases, the
    second derivative of the function is 0. Unfortunately we can have that the second derivative of
    the function can also be zero at an inflection point, where the function changes from being convex
    to being concave. This means that we cannot determine whether a function is weakly convex
    or concave in its domain by simply looking at the second derivative, because even if the second
    derivative is equal to zero, it may be that it is an inflection point, and not that the function is a
    straight line at that point. Using the second derivative we can only determine, then, whether a
    function is either strictly convex or strictly concave. We have, then, that

    If at \( x = x_1 \), \( f''(x_1) > 0 \), then a function is strictly convex at \( (x_1, f(x_1)) \).

    Similarly

    If at \( x = x_1 \), \( f''(x_1) < 0 \), then a function is strictly concave at \( (x_1, f(x_1)) \).

    Notice that these two are sufficient conditions, but not necessary. Even when \( f''(x_1) = 0 \) a
    function can be strictly convex or strictly concave; for example, \( f(x) = x^4 \) is strictly convex
    at \( x = 0 \) even though \( f''(0) = 0 \). However, if the second derivative at the point
    is positive we know for sure that the function is convex, and if it’s negative we know for sure that
    the function is concave. Notice that in order to use the second derivative to determine whether
    the function is convex or concave at a point, the function must be twice differentiable at that
    point.

    6.2 Attitude Towards Risk

    A great application of the concepts of concavity and convexity in Economics comes with the
    attitudes towards risk. Consider the following game. You can pay a fixed amount of money in
    advance to toss a coin. If the coin lands heads you collect $10, and if the coin lands tails you
    collect $20. How much you’re willing to pay to enter this game depends on how you value the
    uncertainty (risk) of the payout. Since we know that each outcome has a probability of 0.5, we
    know that the expected value of the game is

    \[ EV = 0.5 \cdot \$10 + 0.5 \cdot \$20 = \$15. \]

    Let this be a fair game, that is, one where the cost of entering the game is exactly its expected payoff:
    $15. A person’s attitude towards risk can be thought of as whether (s)he values having $15 for
    sure more or less than entering the game. A risk averse person will decline entering the game
    every time because (s)he values having $15 for sure more than the possibility of winning $5
    when it’s accompanied by the possibility of losing $5. On the other hand, a risk loving person
    will always play this game, because (s)he values the possibility of winning $5 more than the
    possibility of losing $5, or in other words prefers the uncertainty (risk) to the security of having
    $15. A risk neutral person will be indifferent between playing the game or not, because (s)he
    sees both scenarios as identical.

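    [Figure 10: Attitudes Towards Risk. Panel (a), Risk Averse: utility \( U(x) \) against money x, with x values 10, 15, and 20, points M, A, N, and B, and the levels \( U(\$15) \) and EU marked. Panel (b), Risk Loving: the same axes with points M', A', N', and B'.]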

    In Economics we always use the concept of utility when referring to people’s valuations. In this
    case we can think about the utility of money, \( U(x) \), where \( U(\cdot) \) is the utility function and x is
    money. Here we have three possible utilities we have to think about: the utility the individual
    would have if the money was $10, i.e. the coin landed heads; the utility the individual would
    have if the money was $20, i.e. the coin landed tails; and the utility if the money was $15, i.e.
    the individual didn’t play the game. If the individual didn’t play the game, (s)he would have
    a utility of \( U(\$15) \). How do we calculate the utility of the game to compare it to the utility of
    having $15? If you play the game you will have two possible outcomes, so you will have two
    possible utilities: \( U(\$10) \) if the coin lands heads, and \( U(\$20) \) if the coin lands tails. We know
    the probabilities that those two possible outcomes have, so prior to the game you would have an
    expected utility of

    \[ EU = 0.5 \cdot U(\$10) + 0.5 \cdot U(\$20). \]

    Notice that both the EV and the EU are linear combinations, using the same weights: in this case
    \( \alpha = 0.5 \). The EV is a linear combination of the x values, and the EU is the linear combination
    of the values of the function, the utility function in this case, at those x values. In figure 10a we
    have the case of the risk averse individual. His/her utility of having $15 for sure, i.e. \( U(\$15) \), has
    to be greater than his/her expected utility from the game. Point M represents the utility the
    individual would have if the amount of money is $10, and point N represents the utility of $20.
    These are the two possible payouts. Point B represents EU, and, as expected, it’s on the line
    that joins points M and N, exactly at the midpoint because in this case both weights are 0.5. Point
    A represents the utility the individual has of having $15 for sure. Notice that this would be the
    utility the player would be giving up in exchange for the expected utility at point B. Clearly the
    risk averse individual would never play this game because (s)he gets more utility from holding the
    $15 than (s)he expects to get from the game. This means that for a risk averse individual the
    utility of money function is strictly concave, because the curve will always be above any linear
    combination of any two points on the curve.

    Figure 10b presents the case of the risk loving individual. In this case point M' represents the
    utility of having $10, and point N' the utility of having $20. Like before, EU is the midpoint
    of the line that joins these two points. This time this is represented by point B'. For the risk
    loving individual this point should have more value than having $15 for sure, which means that
    this point ought to be above the point on the utility curve at \( x = 15 \), A'. This is exactly what
    we observe. Notice, then, that for a risk loving individual the utility of money function has to
    be strictly convex, because the expected utility of a game always has to be above the utility of
    the certain value.
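    To make the comparison between \( U(\$15) \) and EU concrete, here is a minimal Python sketch of the coin-toss game. The two utility functions (a square root for the risk averse case and a square for the risk loving case) are hypothetical choices made only for illustration; any strictly concave or strictly convex function would lead to the same ranking.

```python
import math

def expected_value(payoffs, probs):
    """Linear combination of the payoffs, weighted by their probabilities."""
    return sum(p * x for x, p in zip(payoffs, probs))

def expected_utility(u, payoffs, probs):
    """Linear combination of the utilities of the payoffs, using the same weights."""
    return sum(p * u(x) for x, p in zip(payoffs, probs))

payoffs = [10, 20]   # heads pays $10, tails pays $20
probs = [0.5, 0.5]
ev = expected_value(payoffs, probs)   # cost of the fair game

# Hypothetical utility functions, assumed only for this sketch.
risk_averse_u = math.sqrt             # strictly concave
risk_loving_u = lambda x: x ** 2      # strictly convex

for name, u in [("risk averse", risk_averse_u), ("risk loving", risk_loving_u)]:
    eu = expected_utility(u, payoffs, probs)
    decision = "declines" if u(ev) > eu else "plays"
    print(f"{name}: U(EV) = {u(ev):.4f}, EU = {eu:.4f}, {decision} the fair game")
```

    With the concave utility the sure $15 is worth more than the game, and with the convex utility the game is worth more, which is the ordering of points A and B (and A' and B') described above.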

    Having seen what the utility function looks like for both risk averse and risk loving individuals,
    can you determine what the utility of money would look like for a risk neutral individual?

    7 Maxima and Minima

    A global (absolute) maximum is a point in the function’s domain where the function achieves
    its highest value in all its domain. Similarly, a global (absolute) minimum is a point in the
    function’s domain where the function achieves its lowest value in all its domain. The global
    maximum or minimum could be at points at each end of the function’s domain, or could be at
    points in the middle of the domain. When we have a maximum or a minimum at points that
    are not at the extremes of the domain, we call these a local (relative) maximum or minimum,
    respectively. A local maximum can be, but is not necessarily, a global maximum. Similarly, a
    local minimum can be, but is not necessarily, a global minimum.

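    [Figure 11: Maximum and Minimum. Three panels with x on the horizontal axis and y on the vertical axis. Panel (a): points A, B, and C. Panel (b): point D. Panel (c): points E and F.]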

    To better understand these concepts, consider figure 11. In figure 11a we see that all points on
    the line have the same value for the function, so each point on the line is both a maximum and
    a minimum of the function. Clearly, with this type of function, there is no interest in choosing a
    particular value of x in the domain of the function, because they all return the same value. The
    function in figure 11b is strictly increasing as x increases. There is no finite maximum as long as
    the domain of x is the set of non-negative real numbers. If y, however, is constrained to be
    non-negative, the minimum would be at \( x = 0 \), i.e. at point D. That would actually be the global
    minimum of y. Points E and F in figure 11c are examples of a local maximum and minimum,
    respectively. This is because they’re each an extreme value in the neighborhood of the point only.
    A relative extremum can be, but is not necessarily, an absolute extremum. For example, point
    E is a relative maximum, but there is no guarantee that it’s an absolute maximum, although it
    could be depending on what the domain of the function is. A similar story could be told about
    point F.

    In economics we model human behavior using mathematics. We usually do so by setting up
    a maximization or minimization problem, where we would like to find the value of the input
    variables, the x, where the function has a maximum or a minimum. For example, we can model
    human consumption through a utility maximization problem where the consumer chooses the
    level of expenditure that maximizes his/her utility. Similarly, we can model consumption through
    an expenditure minimization problem where the consumer must buy a certain quantity of goods.
    In the next two sections, we actually see two examples of this modeling. Now, if a function is
    strictly increasing in its input, as is the case in figure 11b, notice that the maximum could be
    at \( x = \infty \), so it would be indeterminate. In those cases there would be a constraint that sets
    the maximum value of the input variable you will be able to have. The problems that are more
    interesting to model, however, are those where there is a local maximum or minimum, i.e. the
    point of interest is not at an extreme of the domain of the function. Since this is usually what
    we look for when modeling, it is imperative that we know how to find the points in a function
    where there may be a local maximum or a local minimum.

    7.1 Conditions for a Local Maximum or Minimum

    Consider a local maximum or minimum in a function that is differentiable at that point. If you
    were to move away from that point in either direction very slightly the value of the function
    must not change much. In fact the limit of \( \Delta y \) as \( \Delta x \) approaches zero must be zero. This
    is saying that \( dy = 0 \) at a local maximum or minimum. Remember from equation (11) that

    \[ dy = f'(x)\,dx. \]

    When \( \Delta x \) approaches zero, \( dx \neq 0 \). I know I’ve mentioned this before, but it’s very important
    that you keep this in mind. The differential of x may be infinitesimally small but it is
    not zero. How can \( dy = 0 \) when \( dx \neq 0 \)? The answer is clear: \( f'(x) \) must be zero. For a local
    maximum or a local minimum at a point where the function is differentiable we,
    then, need that \( f'(x) = 0 \). Is it enough that \( f'(x) = 0 \) at a point where the function is differentiable
    for there to be a maximum or a minimum? The answer is no.

    [Figure 12: An Inflection Point. Curves \( y = f(x) \), \( y' = f'(x) \), and \( y'' = f''(x) \) plotted against x, with point A at \( x = x_0 \).]

    Consider figure 12. We can see that \( f'(x_0) = 0 \) because the slope of the line tangent at A is 0,
    and the red curve that represents the first derivative, \( y' \), is at zero at that point. However, the
    function has neither a local maximum nor a minimum at that point. Point A is, in fact, what we
    call an inflection point: a point where the function changes from being strictly concave to being
    strictly convex, or vice-versa. Not all inflection points will have a first derivative that is equal
    to zero, but, for a twice differentiable function, all of them will have a second derivative that is zero. However, not all points with
    a second derivative equal to zero are necessarily inflection points either.
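    Both statements can be checked on simple cases. The sketch below uses two hypothetical functions that are not taken from the figure: \( f(x) = x^3 \), which has a zero slope at \( x = 0 \) without having an extremum there, and \( g(x) = x^4 \), whose second derivative is zero at \( x = 0 \) even though that point is a minimum and not an inflection point.

```python
def f(x):
    return x ** 3          # hypothetical example with an inflection point at x = 0

def f_prime(x):
    return 3 * x ** 2

def f_second(x):
    return 6 * x

def g(x):
    return x ** 4          # hypothetical example with g''(0) = 0 but no inflection point

def g_second(x):
    return 12 * x ** 2

x0, h = 0.0, 1e-3          # point of interest and a small step on either side

# f: zero slope, yet one neighbour is lower and the other is higher.
f_is_stationary = f_prime(x0) == 0
f_is_extremum = ((f(x0 - h) < f(x0) and f(x0 + h) < f(x0)) or
                 (f(x0 - h) > f(x0) and f(x0 + h) > f(x0)))
f_concavity_switches = f_second(x0 - h) < 0 < f_second(x0 + h)

# g: zero second derivative, but its sign is the same on both sides.
g_second_is_zero = g_second(x0) == 0
g_concavity_switches = g_second(x0 - h) * g_second(x0 + h) < 0
g_is_minimum = g(x0 - h) > g(x0) and g(x0 + h) > g(x0)

print(f_is_stationary, f_is_extremum, f_concavity_switches)
print(g_second_is_zero, g_concavity_switches, g_is_minimum)
```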

    So far we have seen that in order for a differentiable function to have a local maximum or
    minimum in its domain, it is necessary but not sufficient for the first derivative at that point
    to equal 0. We, therefore, need something else. Consider, again, figure 11c. At point E, where
    the function has a local maximum, the function is strictly concave. Similarly, at point F , where
    the function has a local minimum, the function is strictly convex. So if we have that the first
    derivative at the point is zero, and the second derivative is negative, which is the sufficient
    condition for the function to be strictly concave at a point, we are sure to have a maximum.
    Similarly, if we have that at a point the first derivative is zero and the second derivative is
    positive, the sufficient condition for a function to be strictly convex at a point, we’re sure to have
    a minimum. As we keep mentioning, these conditions for strict convexity and strict concavity are
    sufficient, but not necessary.

    We are now ready to look at the conditions for a local maximum or minimum.

    For a twice continuously differentiable function \( f(x) \) to have a relative extremum at
    \( x = x_0 \) it is necessary that

    1. \( f'(x_0) = 0 \),

    2. and that

    (a) \( f''(x_0) \leq 0 \) for a relative maximum, or

    (b) \( f''(x_0) \geq 0 \) for a relative minimum.

    It is sufficient that

    1. \( f'(x_0) = 0 \),

    2. and that

    (a) \( f''(x_0) < 0 \) for a relative maximum, or

    (b) \( f''(x_0) > 0 \) for a relative minimum.

    The conditions show that for a maximum it is necessary both that the first derivative equals zero
    and the second derivative is less than or equal to zero. This means that, in addition to the function
    having a slope of 0 at the point, it must be weakly concave. Similarly, it is necessary for a
    minimum that the first derivative is equal to zero and the second derivative is greater than or
    equal to zero, i.e. weakly convex. However, these conditions don’t suffice in either case because,
    as we saw, an inflection point can have both first and second derivatives equal to zero, and is neither
    a local maximum nor a local minimum. That is why we present the other set of conditions, where
    for a maximum it is sufficient that the first derivative is equal to zero and the second derivative
    is less than zero at the point, and for a minimum it is sufficient that the first derivative is zero
    and the second derivative is positive at the point. This last set of conditions guarantees that
    there is a maximum or a minimum, depending on which set is satisfied, at the point, but if these
    conditions are not met, we cannot rule out a maximum or a minimum at the
    point, because they are not necessary.

    We usually refer to the condition about the first derivative as the first-order condition and the
    condition about the second derivative as the second-order condition, in clear reference to the
    order of the derivative involved in each of the conditions. I would like to emphasize that the first
    set of conditions are necessary only for twice continuously differentiable functions, but not for
    all functions. It may be the case that we have a relative maximum or minimum but that either
    the first derivative or the second derivative doesn’t exist at the point where we have the relative
    maximum or minimum.[2] An example is \( y = |x| \). This function has a relative minimum at \( x = 0 \),
    but it’s not differentiable at that point. The second set of conditions are always sufficient because
    they can only be met at a point where the first and second derivatives exist. The second set of
    conditions are never necessary, though.

    Example 18

    Find the local extrema (maximum or minimum) of the following function and determine
    whether they’re a maximum or a minimum

    \[ y = f(x) = x^3 - 12x^2 + 36x + 8. \]

    We see that the function is twice continuously differentiable, so we know that any
    relative extremum must be at a point where the first derivative is zero, since this is necessary. The first
    derivative is

    \[ y' = 3x^2 - 24x + 36. \]

    To find the values of x at which \( y' = 0 \) we use the quadratic formula:

    \[ x^* = \frac{24 \pm \sqrt{24^2 - 4 \cdot 3 \cdot 36}}{2 \cdot 3} = 4 \pm 2. \]

    We have, then, that \( x^*_1 = 2 \) and \( x^*_2 = 6 \). So far we know that we could have a relative
    extremum in two cases, but we do not know for sure if we do. The only way we know for
    sure is if the second derivative is different from zero at either of those two points. If it equals
    zero we may have a local extremum but we may not. The second derivative is

    \[ y'' = 6x - 24. \]

    [2] You should notice that if the first derivative doesn’t exist at a point, the second derivative doesn’t exist either.

    At \( x = 2 \) we have that \( f''(2) = -12 < 0 \). This means that at \( x = 2 \) we would have a relative
    maximum. The value of y at that point is

    \[ f(2) = 2^3 - 12 \cdot 2^2 + 36 \cdot 2 + 8 = 40. \]

    At \( x = 6 \) we have that \( f''(6) = 12 > 0 \). This means that at \( x = 6 \) we would have a relative
    minimum. The value of y at that point is

    \[ f(6) = 6^3 - 12 \cdot 6^2 + 36 \cdot 6 + 8 = 8. \]

    We therefore have a relative maximum at \( A = (2, 40) \) and a relative minimum at \( B = (6, 8) \).
    Notice that if in either case the second derivative was equal to zero we would not have been
    able to determine whether we had a relative extremum at all, unless we graphed the function
    and saw it graphically.
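    The steps of this example can be reproduced with a short Python sketch. The function and its derivatives are the ones from the example; the helper name classify and the variable names are assumptions made for the illustration.

```python
import math

def f(x):
    return x**3 - 12 * x**2 + 36 * x + 8

def f_second(x):
    return 6 * x - 24

# First-order condition: 3x^2 - 24x + 36 = 0, solved with the quadratic formula.
a, b, c = 3, -24, 36
disc = b**2 - 4 * a * c
critical_points = sorted([(-b - math.sqrt(disc)) / (2 * a),
                          (-b + math.sqrt(disc)) / (2 * a)])

def classify(x):
    """Apply the sufficient second-order condition at a critical point."""
    s = f_second(x)
    if s < 0:
        return "relative maximum"
    if s > 0:
        return "relative minimum"
    return "inconclusive"   # a zero second derivative settles nothing

results = [(x, f(x), classify(x)) for x in critical_points]
for x, y, kind in results:
    print(f"x = {x}, f(x) = {y}: {kind}")
```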

    8 Profit Maximization

    We now consider the first of two cases of what we call optimization: finding the value of the
    input variable where we have the optimal value of what we call an objective function. Notice
    that whether a function has a maximum or a minimum in mathematics, doesn’t make that value
    optimal in any sense. It’s simply a characteristic of a function. It is us, as economists, that
    decide whether the maximum or the minimum of a certain function is optimal, and we do so by
    observing human behavior, and determining how people behave.

    The case we’re considering now, is a basic case of profit maximization. As modelers we assume
    that the objective of a firm is to produce and sell at the point that maximizes its profit. That
    is why that point where the firm reaches its maximum profit is optimal: because it satisfies the
    objective of the firm. In any optimization case, we will have an objective function. Since the
    objective of the firm is to maximize profit, the objective function of the profit maximization
    problem is the profit function:

    \[ \pi(Q) = R(Q) - C(Q). \]

    Notice that the profit function is a function of just one input: Q. The firm, then, must decide on
    the quantity to produce. The variable or variables that we choose in an optimization
    problem are the endogenous variables of the problem, because they are decided (determined)
    in the problem. In our case, the endogenous variable of the problem is Q, the firm’s output. We,
    then, write the profit maximization problem as

    \[ \max_{Q} \; \pi(Q) = R(Q) - C(Q). \tag{41} \]

    Equation (41) expresses the maximization problem. First, it tells us that we are maximizing
    something, because it has the keyword max. Second, it tells us what we are maximizing with
    respect to, because it has the variable that we choose under the keyword max. Third, it
    shows the objective function, the one that we have to maximize or minimize.


    Now that we know how to set up the maximization problem, how do we go about it? The process
    is always the same:

    1. We use the first-order condition to find the values of the endogenous variable where the
    objective function may be optimized (maximized in this case).

    2. We use the second-order conditions to determine whether we have a maximum or a minimum
    at those values of the endogenous variable we found using the first-order conditions.

    Before we consider an example, let’s consider the first-order condition from the problem in
    equation (41). This is nothing but that the first derivative has to equal zero. This means that

    \[
    \begin{aligned}
    \pi'(Q) &= 0 \\
    R'(Q) - C'(Q) &= 0 \\
    R'(Q) &= C'(Q) \\
    MR(Q) &= MC(Q),
    \end{aligned} \tag{42}
    \]

    since the marginal revenue is the first derivative of the revenue function with respect to output,
    and the marginal cost is the first derivative of the cost function with respect to output. Clearly,
    we can’t solve for Q because we have the functions in general form, but what equation (42)
    shows is that the profit maximizing output will be at a level where the marginal revenue, the
    slope of the revenue function, is equal to the marginal cost, the slope of the cost function. This
    is something that was told to you in your principles course without really explaining why, or
    maybe using a graph to show you. Now you know where that condition comes from, and why.

    Let’s consider the sufficient second-order condition of the problem. Since it’s a maximization
    problem, this would be that the function is strictly concave at the profit-maximizing point, i.e.
    that the second derivative of the profit function is negative

    \[
    \begin{aligned}
    \pi''(Q) &< 0 \\
    R''(Q) - C''(Q) &< 0 \\
    R''(Q) &< C''(Q) \\
    MR'(Q) &< MC'(Q).
    \end{aligned} \tag{43}
    \]

    Equation (43) shows that at the profit maximizing output, although the marginal revenue equals
    the marginal cost, it must be that the slope of the marginal revenue curve is less than the slope
    of the marginal cost curve. Let’s consider the example.

    Example 19

    In this example we maximize the profit function in equation (6), so the profit maximization
    problem is

    \[ \max_{Q} \; \pi = -10 + 12Q^2 - 3Q^3. \]


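    Using the first-order condition

    \[
    \begin{aligned}
    24Q - 9Q^2 &= 0 \\
    Q(24 - 9Q) &= 0,
    \end{aligned}
    \]

    so we have that \( Q_1 = 0 \), and

    \[
    \begin{aligned}
    24 - 9Q_2 &= 0 \\
    9Q_2 &= 24 \\
    Q_2 &\approx 2.67
    \end{aligned}
    \]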

    We have, then, two candidate values at which profit may be maximized. To determine which one
    actually yields a maximum we check the second-order condition, that is, that the second
    derivative has to be negative. The second derivative is given by

    \[ \pi''(Q) = 24 - 18Q. \]

    This means that

    \[ \pi''(Q_1) = 24 - 18 \cdot 0 = 24 > 0, \]

    and

    \[ \pi''(Q_2) = 24 - 18 \cdot 2.67 \approx -24 < 0. \]

    Since the condition is that the second derivative is negative, we see that the profit maximizing
    output is \( Q^* = 2.67 \). We usually use the \( * \) to indicate that the value of the variable is the
    solution to the optimization problem.

    Now that we have the output, we can find the maximum profit. This is given by

    \[ \pi^* = -10 + 12 \cdot 2.67^2 - 3 \cdot 2.67^3 \approx 18.44. \]
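    The two-step process (first-order condition to find candidates, second-order condition to select the maximum) can be sketched in Python for this profit function. The function names are assumptions made for the illustration; the candidates are the two roots found above, with \( Q_2 \) kept as the exact fraction 24/9 and not the rounded 2.67.

```python
def profit(q):
    return -10 + 12 * q**2 - 3 * q**3

def profit_second(q):
    # Second derivative of the profit function: 24 - 18Q
    return 24 - 18 * q

# Step 1: the first-order condition Q(24 - 9Q) = 0 gives two candidates.
candidates = [0.0, 24 / 9]

# Step 2: keep only the candidates that satisfy the sufficient second-order condition.
maximizers = [q for q in candidates if profit_second(q) < 0]

q_star = maximizers[0]
pi_star = profit(q_star)
print(round(q_star, 2), round(pi_star, 2))
```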

    9 Optimal Timing

    The problem of optimal timing is one we encounter in many economic decisions, for example the
    best time to cut the trees to produce timber. It is, in fact, a profit maximization problem, but
    we now have a time factor. Let’s consider the timber example to get an idea of how we address
    these problems. Clearly, the trees need to grow to produce a certain level of timber. Growing
    the trees has a certain operational cost, but we can assume that the costs are proportional to the
    size of the trees and, thus, to time, so we can assume a certain per tree profit.[3] The key issue in
    these problems is how to account for time. The fact that the trees need to grow means that the
    profit that we extract from the timber itself always increases with time. The problem is that a
    dollar tomorrow is not worth the same to us as a dollar today. Why? Because if we were to cut
    the timber sooner, we could invest the profit we extract from selling the amount of timber we
    collect in treasury bonds, and earn interest on the profit we extracted. The opportunity
    cost of not cutting and selling the timber now is the interest that we would make on
    the profit. How do we account for this opportunity cost? We need to consider the present value
    of the profit we make, discounting the profit at a given point in time at the interest rate, so the
    problem is one of maximizing the present value of the profit by choosing the time at which to
    cut the timber, given the initial value of the trees, the value growth rate, and the interest rate.

    [3] Notice that if all trees are the same, then the total profit is nothing but the per tree profit times the number
    of trees, which is constant.

    One of the major characteristics of these problems is that we assume continuous growth and
    discounting. What do we mean by this? This means that to grow the value of the trees we
    multiply by e raised to the trees’ growth rate times t, the time, and to discount the future profit
    to the present value, we multiply by e raised to the power of the negative interest rate times t.
    An interesting consequence of this continuous growth and discounting is that we can take the
    natural logarithm of the present value function to get the first and second order conditions of
    the maximization problem, because taking the natural logarithm simplifies the process since it
    tends to get rid of the exponentials. The reason we can do this is that taking the natural logarithm is a
    monotonic transformation of the original function. A monotonic transformation of a function is
    one that, although it may change the scale of the values that the function returns, preserves the
    order of the values that the function returns. This means that if \( f(x_1) > f(x_2) \) then
    \( \ln[f(x_1)] > \ln[f(x_2)] \), for any \( x_1 \) and \( x_2 \) in the domain of \( f(x) \). In the case of the natural logarithm, the
    range of the function must be positive values, since otherwise the natural logarithm would be
    undefined. Since the profit of the trees is always positive, we can apply this trick here. Let’s
    look, then, at an example.

    Example 20

    The current value of the tree plantation is $K. This value grows with time at a rate of \( 2\sqrt{t} \).
    Letting the interest rate be r, what is the optimal time at which to cut the trees and sell the
    timber?

    The first step is to set the value of the trees as a function of time. Since the growth rate is
    \( 2\sqrt{t} \), this means that

    \[ V(t) = K e^{2\sqrt{t}}. \]

    We don’t want to maximize the value of the trees, but rather their present value.
    To find the present value we need to discount the value at the interest rate, r. Therefore

    \[ PV(t) = V(t) e^{-rt} = K e^{2\sqrt{t}} e^{-rt} = K e^{2\sqrt{t} - rt}. \]

    The maximization problem can be expressed as

    \[ \max_{t} \; PV(t) = K e^{2\sqrt{t} - rt}. \]

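    I solve, now, this problem as is and, after that, I show you how maximizing the natural
    logarithm of the present value yields the same result. We have, then, that the first order
    condition is

    \[
    \begin{aligned}
    PV'(t) &= 0 \\
    K e^{2\sqrt{t} - rt} \left( \frac{1}{\sqrt{t}} - r \right) &= 0 \\
    K e^{2\sqrt{t} - rt} \frac{1}{\sqrt{t}} &= K e^{2\sqrt{t} - rt} \, r \\
    \frac{1}{\sqrt{t}} &= r \\
    \sqrt{t} &= \frac{1}{r} \\
    t^* &= \frac{1}{r^2}.
    \end{aligned}
    \]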

    Notice that, since r is in the denominator, a larger interest rate implies that the optimal time
    to sell is sooner, i.e. \( t^* \) decreases as r increases. This is how it should be, because r is the
    opportunity cost of not cutting now. The second order condition requires that the second
    derivative is negative. Let’s check on that.

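    \[
    \begin{aligned}
    PV''(t) &= K e^{2\sqrt{t} - rt} \left( \frac{1}{\sqrt{t}} - r \right)^2 + K e^{2\sqrt{t} - rt} \left( -\frac{1}{2\sqrt{t^3}} \right) \\
    &= K e^{2\sqrt{t} - rt} \left( \frac{1}{t} + r^2 - \frac{2r}{\sqrt{t}} - \frac{1}{2\sqrt{t^3}} \right).
    \end{aligned}
    \]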

    Notice that \( e^x \) is always positive for any x, so in order for this to be negative, it must be that
    the term in parentheses is negative, since K is positive. Evaluating the term in parentheses
    at the solution, we have

    \[
    \begin{aligned}
    \frac{1}{1/r^2} + r^2 - \frac{2r}{1/r} - \frac{1}{2/r^3} &= r^2 + r^2 - 2r^2 - \frac{r^3}{2} \\
    &= -\frac{r^3}{2} < 0.
    \end{aligned}
    \]

    We now apply the trick I mentioned of taking the natural logarithm before maximizing, and
    you’ll see how we reach the same optimal value and conclusion of the second order condition.


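    The natural logarithm of the present value is

    \[
    \begin{aligned}
    \ln PV(t) &= \ln \left( K e^{2\sqrt{t} - rt} \right) \\
    &= \ln K + \ln e^{2\sqrt{t} - rt} \\
    &= \ln K + (2\sqrt{t} - rt) \ln e \\
    &= \ln K + 2\sqrt{t} - rt.
    \end{aligned}
    \]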

    We can, thus, express the maximization problem as

    \[ \max_{t} \; \ln PV(t) = \ln K + 2\sqrt{t} - rt. \]

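    The first order condition from this problem should lead us to the same solution:

    \[
    \begin{aligned}
    \frac{d \ln PV}{dt} &= 0 \\
    \frac{1}{\sqrt{t}} - r &= 0 \\
    \frac{1}{\sqrt{t}} &= r \\
    \sqrt{t} &= \frac{1}{r} \\
    t^* &= \frac{1}{r^2}.
    \end{aligned}
    \]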

    This is the same result that we had before, but it was much easier to reach because in the
    derivative we didn’t get stuck with the exponential parts of the function. We now check the
    second order condition of the problem.

    \[ \frac{d^2 \ln PV(t)}{dt^2} = -\frac{1}{2\sqrt{t^3}}. \]

    This is negative for any \( t > 0 \). In particular, it is negative at \( t^* \), where it
    equals \( -r^3/2 \), which is what we got from the term in the parentheses before.

    Once we have the solution, we can express the present value function in terms of just the
    interest rate by substituting it in the present value function. This will give us the optimal
    present value function

    \[ PV^* = K e^{2\sqrt{1/r^2} - r(1/r^2)} = K e^{2/r - 1/r} = K e^{1/r}. \]


    Now we have the optimal time and the present value in terms of the interest rate. If the
    interest rate is 4%, we have that

    \[ t^* = \frac{1}{0.04^2} = 625 \]

    \[ PV^* = K e^{1/0.04} \approx 72 \cdot 10^9 K \]

    If the interest rate is 10%

    \[ t^* = \frac{1}{0.1^2} = 100 \]

    \[ PV^* = K e^{1/0.1} \approx 22{,}026.47 K \]
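    As a numerical cross-check of this example, the sketch below searches a grid of cutting times for the maximum of both the present value and its natural logarithm, and compares the result with the closed form \( t^* = 1/r^2 \). The grid search, the grid bounds, the choice K = 1, and the function names are all assumptions made for the illustration; the two objective functions are the ones from the example.

```python
import math

def present_value(t, r, k=1.0):
    """PV(t) = K * exp(2*sqrt(t) - r*t), with K = 1 assumed by default."""
    return k * math.exp(2 * math.sqrt(t) - r * t)

def log_present_value(t, r, k=1.0):
    """ln PV(t) = ln K + 2*sqrt(t) - r*t."""
    return math.log(k) + 2 * math.sqrt(t) - r * t

def grid_argmax(func, r, t_max, steps):
    """Return the grid point in (0, t_max] where func(t, r) is largest."""
    grid = [t_max * i / steps for i in range(1, steps + 1)]
    return max(grid, key=lambda t: func(t, r))

r = 0.10
t_closed_form = 1 / r**2                        # about 100
t_from_pv = grid_argmax(present_value, r, t_max=400.0, steps=4000)
t_from_log = grid_argmax(log_present_value, r, t_max=400.0, steps=4000)

print(t_closed_form, t_from_pv, t_from_log)
print(present_value(t_from_pv, r), math.exp(1 / r))   # both close to 22,026.47 when K = 1
```

    Because the natural logarithm is a monotonic transformation, both searches pick the same cutting time, which is the point of the trick used in the example.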

    10 Approximating Functions

    Many times nonlinear functions are not easy to handle in calculus. Not necessarily for derivatives,
    but many times in handling limits or integrals. In those cases we like to approximate the function
    with a polynomial of a certain order n, so

    \[ y \equiv f(x) \approx a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_n x^n. \tag{44} \]

    Equation (44) tells us that y, which is equivalent to \( f(x) \), is approximately equal to the
    polynomial on the right. This means that there is a remainder, i.e. a difference between the actual
    y value for x, and the value of the polynomial. Letting \( P_n \equiv a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_n x^n \),
    and \( R_n \) be the remainder, we have that

    \[ y \equiv f(x) \equiv P_n + R_n. \tag{45} \]

    We will consider a general form of this remainder later, but notice that we’re not very interested
    in the remainder per se. We know that it exists, which is why using just the polynomial part,
    \( P_n \), is an approximation. However, our major interest lies in \( P_n \).

    In order to approximate the function, we need to select a point in the domain of the function
    around which we build the polynomial, also called expanding the function, and the order of
    the polynomial. We are going to consider two cases: the Maclaurin series, which expands the
    function around \( x = 0 \), and the Taylor series, which expands the function around a general point
    where \( x = x_0 \) and \( x_0 \) will be a specific value of x in the function’s domain. In fact, the Maclaurin
    series is a Taylor series where \( x_0 = 0 \).

    10.1 Maclaurin Series

    What we need to determine is the values for the different coefficients in the polynomial: \( a_0, a_1, \ldots, a_n \).
    With the Maclaurin series it is very straightforward because the expansion is done around \( x = 0 \).
    We, then, have that

    \[
    a_0 = \frac{f(0)}{0!}, \quad a_1 = \frac{f'(0)}{1!}, \quad a_2 = \frac{f''(0)}{2!}, \quad a_3 = \frac{f'''(0)}{3!}, \quad \ldots, \quad a_n = \frac{f^{(n)}(0)}{n!}, \tag{46}
    \]

    where the symbol ! is the factorial, and for a positive integer n, \( n! \equiv n(n-1)(n-2) \cdots 3 \cdot 2 \cdot 1 \),
    and \( 0! \equiv 1 \). This means that, for example, \( 1! = 1 \), \( 2! = 2 \cdot 1 = 2 \), \( 3! = 3 \cdot 2 \cdot 1 = 6 \), and so on.
    We, then, have that for a Maclaurin series

    \[ P_n = \frac{f(0)}{0!} + \frac{f'(0)}{1!} x + \frac{f''(0)}{2!} x^2 + \frac{f'''(0)}{3!} x^3 + \cdots + \frac{f^{(n)}(0)}{n!} x^n. \tag{47} \]

    The Maclaurin series is easy to remember because each coefficient involves the same order of the
    derivative, and consequently the same number before the factorial, as the exponent on the x,
    since you should realize that \( f(x) \equiv f^{(0)}(x) \), and that \( x^0 = 1 \). One thing you should realize is that
    at \( x = 0 \), \( P_n = f(0) \), and the remainder will be 0. So at the point of expansion, the Maclaurin
    series will return the same value as the function. The only thing to determine, then, is what
    order you want to have in the Maclaurin polynomial: n.

    Example 21

    Let $y = e^x$. Form a 4th-order Maclaurin expansion.

    The first thing we do is evaluate the function and its first four derivatives at $x = 0$. Notice
    that the derivative of $e^x$ is always $e^x$, so

    $$\begin{aligned}
    f(0) &= e^0 = 1 \\
    f'(0) &= e^0 = 1 \\
    f''(0) &= e^0 = 1 \\
    f'''(0) &= e^0 = 1 \\
    f^{(4)}(0) &= e^0 = 1.
    \end{aligned}$$


    This means that

    $$y \approx 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{x^4}{24}.$$
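
    As a quick numerical check of Example 21, the following Python sketch evaluates the 4th-order Maclaurin polynomial of $e^x$ and compares it with the exact value at a few points. The function name maclaurin_exp and the sample points are illustrative choices, not part of these notes.

```python
import math


def maclaurin_exp(x, order):
    """Evaluate the Maclaurin polynomial of e**x up to the given order."""
    # Every derivative of e**x equals 1 at x = 0, so the k-th coefficient is 1/k!.
    return sum(x**k / math.factorial(k) for k in range(order + 1))


# Compare the 4th-order polynomial with the exact function at a few points.
for x in (0.0, 0.5, 1.0, 2.0):
    approx = maclaurin_exp(x, 4)
    exact = math.exp(x)
    print(f"x={x}: P4={approx:.6f}  e^x={exact:.6f}  remainder={exact - approx:.6f}")
```

    At $x = 0$ the polynomial and the function coincide, and the remainder grows as $x$ moves away from the expansion point.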

    Example 22

    Let’s now consider a slightly more difficult example. Let $y = \ln(1 + x)$. Form a 4th-order
    Maclaurin expansion.

    Again, the first thing we do is evaluate the function at $x = 0$, and obtain the first four
    derivatives and evaluate them at 0. We have that

    $$\begin{aligned}
    f(x) &= \ln(1 + x), & f(0) &= \ln(1 + 0) = 0 \\
    f'(x) &= \frac{1}{1 + x}, & f'(0) &= \frac{1}{1 + 0} = 1 \\
    f''(x) &= -\frac{1}{(1 + x)^2}, & f''(0) &= -\frac{1}{(1 + 0)^2} = -1 \\
    f'''(x) &= \frac{2}{(1 + x)^3}, & f'''(0) &= \frac{2}{(1 + 0)^3} = 2 \\
    f^{(4)}(x) &= -\frac{6}{(1 + x)^4}, & f^{(4)}(0) &= -\frac{6}{(1 + 0)^4} = -6.
    \end{aligned}$$

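    We, therefore, have that

    $$\begin{aligned}
    y &\approx 0 + \frac{x}{1!} - \frac{x^2}{2!} + \frac{2x^3}{3!} - \frac{6x^4}{4!} \\
    &\approx x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4}.
    \end{aligned}$$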

    Notice that, in fact, we could build an infinite series where we just divide each power of $x$
    by its exponent and alternate the signs, and that series would equal the exact value of $f(x)$
    for any $x$ in its interval of convergence, $-1 < x \le 1$. That is nice to know but not very
    practical, since in practice it is impossible to sum an infinite series. However, with such a
    simple pattern, we can easily build a much higher-order polynomial.
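
    Because the pattern is so simple, a higher-order polynomial is a one-line sum. The sketch below (maclaurin_log1p is a made-up name for illustration) builds the partial sums for $\ln(1 + x)$ and compares several orders at $x = 0.5$, which lies inside the interval $-1 < x \le 1$ where this series converges, and at $x = 2$, which lies outside it.

```python
import math


def maclaurin_log1p(x, order):
    """Partial sum x - x**2/2 + x**3/3 - ... up to the given order."""
    # The k-th term is (-1)**(k+1) * x**k / k, for k = 1, ..., order.
    return sum((-1) ** (k + 1) * x**k / k for k in range(1, order + 1))


for x in (0.5, 2.0):
    exact = math.log(1 + x)
    for order in (4, 8, 16):
        approx = maclaurin_log1p(x, order)
        print(f"x={x}, n={order}: Pn={approx:.6f}  ln(1+x)={exact:.6f}")
```

    At $x = 0.5$ the partial sums settle toward $\ln(1.5)$ as the order grows; at $x = 2$ they do not, no matter how many terms are added.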

    10.2 Taylor Series

    As we mentioned before, the Taylor series expands the function around a general point $x = x_0$.
    The process is similar to that of the Maclaurin series, but with one addition: we now have
    to measure the distance of the point $x$ from the expansion point, $x_0$. This means that the Taylor
    polynomial is given by

    $$P_n = \frac{f(x_0)}{0!} + \frac{f'(x_0)}{1!}(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \frac{f'''(x_0)}{3!}(x - x_0)^3 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n. \tag{48}$$


    Notice that when $x_0 = 0$, equation (48) simplifies to equation (47), which is why I said before
    that a Maclaurin series is a Taylor series where $x_0 = 0$. Equation (48) gives us a general form to
    find the polynomial approximation. Notice, however, that the coefficients on the powers of $x$ are
    no longer given simply by the function and its derivatives evaluated at the expansion point and
    divided by the corresponding factorial. This is because each $(x - x_0)^k$ term, once multiplied
    out, contributes a constant and powers of $x$ up to $k$. This is why this series is slightly more
    complex to form than the Maclaurin series. Let’s look at an example to see how it is done.

    Example 23

    Let’s expand $\phi(x) = \frac{1}{1 + x}$ around $x_0 = 1$ with $n = 4$.

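    Notice that $\phi(x) = (1 + x)^{-1}$, so we can use the power rule to get the derivatives. Since
    $n = 4$ and $x_0 = 1$, we have that

    $$\begin{aligned}
    \phi(x) &= (1 + x)^{-1}, & \phi(1) &= \frac{1}{2} \\
    \phi'(x) &= -(1 + x)^{-2}, & \phi'(1) &= -\frac{1}{2^2} = -\frac{1}{4} \\
    \phi''(x) &= 2(1 + x)^{-3}, & \phi''(1) &= 2 \cdot \frac{1}{2^3} = \frac{1}{4} \\
    \phi'''(x) &= -6(1 + x)^{-4}, & \phi'''(1) &= -6 \cdot \frac{1}{2^4} = -\frac{3}{8} \\
    \phi^{(4)}(x) &= 24(1 + x)^{-5}, & \phi^{(4)}(1) &= 24 \cdot \frac{1}{2^5} = \frac{3}{4}.
    \end{aligned}$$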

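    So we have that

    $$\begin{aligned}
    \phi(x) &\approx \frac{1}{2} - \frac{x - 1}{4} + \frac{(x - 1)^2}{4 \cdot 2!} - \frac{3(x - 1)^3}{8 \cdot 3!} + \frac{3(x - 1)^4}{4 \cdot 4!} \\
    &\approx \frac{1}{2} - \frac{x - 1}{4} + \frac{x^2 - 2x + 1}{8} - \frac{x^3 - 3x^2 + 3x - 1}{16} + \frac{x^4 - 4x^3 + 6x^2 - 4x + 1}{32} \\
    &\approx \frac{16 + 8 + 4 + 2 + 1}{32} - \frac{4 + 4 + 3 + 2}{16}x + \frac{2 + 3 + 3}{16}x^2 - \frac{1 + 2}{16}x^3 + \frac{1}{32}x^4 \\
    &\approx \frac{31}{32} - \frac{13}{16}x + \frac{1}{2}x^2 - \frac{3}{16}x^3 + \frac{1}{32}x^4.
    \end{aligned}$$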

    You can see how we need to expand the $(x - x_0)$ terms, $(x - 1)$ in this case, to determine what
    the actual coefficients on the powers of $x$ are.
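
    The bookkeeping in Example 23 is easy to get wrong by hand, so here is a small Python sketch that redoes it with exact fractions. It assumes the closed form $\phi^{(k)}(x) = (-1)^k k! \, (1 + x)^{-(k+1)}$, which matches the derivatives listed in the example; the function names are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction
from math import comb, factorial


def phi_derivative_at(k, x0):
    """k-th derivative of 1/(1+x) at an integer x0: (-1)**k * k! / (1+x0)**(k+1)."""
    return Fraction((-1) ** k * factorial(k), (1 + x0) ** (k + 1))


def taylor_in_powers_of_x(order, x0):
    """Return c[0..order] such that the Taylor polynomial equals sum of c[j] * x**j."""
    coeffs = [Fraction(0)] * (order + 1)
    for k in range(order + 1):
        # Coefficient on (x - x0)**k in the Taylor polynomial.
        a_k = phi_derivative_at(k, x0) / factorial(k)
        for j in range(k + 1):
            # Binomial expansion: (x - x0)**k = sum_j C(k, j) * x**j * (-x0)**(k - j).
            coeffs[j] += a_k * comb(k, j) * Fraction(-x0) ** (k - j)
    return coeffs


# 4th-order expansion around x0 = 1, as in Example 23.
for power, c in enumerate(taylor_in_powers_of_x(4, 1)):
    print(f"coefficient on x^{power}: {c}")
```

    The printed coefficients can be compared directly with the last line of the example.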

    I believe it is helpful to see graphically what we’re doing, to get a better idea of
    what the approximation is and what the remainder represents. Figure 13 presents the original
    function in example 23 and the polynomial approximation we found there, $P_4$. We see that

    [Figure 13: Taylor Expansion. Plot of $\phi(x)$ and $P_4$ against $x$, with the expansion point marked $A$.]

    at the expansion point, $A$, the polynomial and the function have the same value. One thing
    we haven’t mentioned, but which is also satisfied, is that the way we have formed the polynomial
    guarantees that the first four derivatives of the original function and of the polynomial evaluate
    to the same values at the expansion point. The reason that it is only the first four derivatives is
    that we are doing a 4th-order expansion. As soon as we move away from the expansion point, i.e.
    $x \neq 1$, the polynomial and the original function will generally have different values, both for
    the function and for the first four derivatives. How different? That usually depends on the order
    of the polynomial. In fact a Taylor, and consequently a Maclaurin, series is said to be convergent
    if, and only if, $P_n \to f(x)$ as $n \to \infty$. What figure 13 shows is that no approximation is perfect,
    but $P_4$ in that case performs pretty well for $x \in [0, 2]$. After $x = 2$ it starts diverging from the
    original function. This shows that when deciding the order of the expansion you are deciding
    on the interval around the expansion point where the approximation behaves very well, and
    the interval(s) where it does not. Hopefully you will be dealing with a convergent series, because
    then, when you need a wider interval for analysis, you can just increase the order of the expansion.
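
    To put numbers on what figure 13 shows, the sketch below tabulates $\phi(x) = 1/(1 + x)$, the polynomial $P_4$ from Example 23, and their difference (the remainder) on a grid of points. The grid and the function names phi and p4 are illustrative assumptions.

```python
def phi(x):
    """Original function from Example 23."""
    return 1 / (1 + x)


def p4(x):
    """4th-order Taylor polynomial around x0 = 1, in increasing powers of x."""
    coeffs = [31 / 32, -13 / 16, 1 / 2, -3 / 16, 1 / 32]
    return sum(c * x**j for j, c in enumerate(coeffs))


for x in (0, 0.5, 1, 1.5, 2, 2.5, 3):
    remainder = phi(x) - p4(x)
    print(f"x={x}: phi={phi(x):.4f}  P4={p4(x):.4f}  remainder={remainder:+.4f}")
```

    The remainder is zero at the expansion point, stays small on $[0, 2]$, and grows quickly beyond $x = 2$.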

    10.3 The Mean Value Theorem

    Before considering the remainder in more detail, we should revisit a theorem that involves
    derivatives, and on which the form of the remainder is based. The mean value theorem states:

    If a function $f(x)$ is continuously differentiable in the interval $[a, b]$, then there exists
    a value $x^* \in (a, b)$ such that

    $$f'(x^*) = \frac{f(b) - f(a)}{b - a},$$

    which means that

    $$f(b) = f(a) + f'(x^*)(b - a).$$

    We are not going to prove this theorem, but rather show it graphically. Consider, then, figure 14,
    where we choose two arbitrary points, $(a, f(a))$ and $(b, f(b))$, from a function $f(x)$ that is
    continuously differentiable in the interval $[a, b]$. The rate of change between these two points, which is given

    [Figure 14: Mean Value Theorem. Plot of $f(x)$ against $x$, marking the points $(a, f(a))$ and $(b, f(b))$.]

    by $[f(b) - f(a)]/(b - a)$, is the slope of the red chord joining the two points. The mean value
    theorem tells us that there must exist a point $(x^*, f(x^*))$ where the slope of the function, $f'(x^*)$,
    is equal to the rate of change between the two points. In the graph we see that the dashed red
    line that is tangent at the point $(x^*, f(x^*))$ is parallel to the chord joining the two points and,
    thus, has the same slope as the chord. Since the two lines are parallel, the slope at the point
    $(x^*, f(x^*))$ is the same as that of the chord joining the two original points. This illustrates the
    first equation in the statement. The second equation is a simple rearrangement of the first one.

    One interesting thing to see is that since $x^* \in (a, b)$, we can obtain $x^*$ as a linear
    combination of $a$ and $b$, such as $x^* = (1 - \theta)a + \theta b$, where $0 < \theta < 1$. Notice that in this case
    the weight, $\theta$, cannot be 0 or 1. This is because $x^*$ belongs to the open interval, not the closed
    one, formed by $a$ and $b$. In other words, $x^*$ can be neither $a$ nor $b$. The second
    interesting thing is why this theorem is called the mean value theorem. The reason is that
    the slope of the chord that joins the two points is the average slope of the function over the
    interval $[a, b]$. In figure 14 we see a nonlinear function in the interval. Since the slope of the
    chord is the mean, at some points the slope will be larger than that of the chord,
    at other points the slope will be smaller, and at one point at least the slope will be the
    same as the average (mean). If the function were linear, all the
    points would have the same slope as the average, so there would be an infinite number of $x^*$.
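
    A concrete case may help. Take $f(x) = x^3$ on $[0, 2]$ (an example chosen here for illustration, not one from these notes): the chord slope is $(8 - 0)/(2 - 0) = 4$, so $x^*$ solves $3x^2 = 4$. The sketch below finds $x^*$ numerically by bisection, under the assumption that $f'(x)$ minus the chord slope changes sign on the interval, and then recovers the weight $\theta$.

```python
def mean_value_point(f, f_prime, a, b, tol=1e-12):
    """Find x* in (a, b) where f'(x*) equals the chord slope, using bisection.

    Assumes f' is continuous and that f'(x) - slope changes sign between a and b.
    """
    slope = (f(b) - f(a)) / (b - a)

    def gap(x):
        return f_prime(x) - slope

    lo, hi = a, b
    if gap(lo) * gap(hi) > 0:
        raise ValueError("bisection needs a sign change of f'(x) - slope on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(lo) * gap(mid) <= 0:
            hi = mid  # the sign change is in the left half
        else:
            lo = mid  # the sign change is in the right half
    return (lo + hi) / 2, slope


def f(x):
    return x**3


def f_prime(x):
    return 3 * x**2


a, b = 0.0, 2.0
x_star, slope = mean_value_point(f, f_prime, a, b)
theta = (x_star - a) / (b - a)  # weight in x* = (1 - theta) * a + theta * b
print(f"chord slope = {slope}, x* = {x_star:.6f}, theta = {theta:.6f}")
```

    Here $x^* = 2/\sqrt{3}$ lies strictly inside the interval, so $\theta$ is strictly between 0 and 1.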

    10.4 The Lagrange Form of the Remainder

    A form of the remainder that is very useful is the Lagrange form of the remainder,

    $$R_n = \frac{f^{(n+1)}(x^*)}{(n+1)!}(x - x_0)^{n+1}, \tag{49}$$


    where $x^* = (1 - \theta)x_0 + \theta x$ with $0 < \theta < 1$, which is the same as saying that $x^* \in (x_0, x)$. Does this
    look similar to what we just saw in the mean value theorem? The reason is
    that the mean value theorem is a particular case of a Taylor series with a Lagrange remainder,
    where $n = 0$. Remember that $f(x) = P_n + R_n$. When $n = 0$, $P_0 = f(x_0)$ and
    $n + 1 = 1$, so $R_0 = f'(x^*)(x - x_0)$. We then have that

    $$f(x) = f(x_0) + f'(x^*)(x - x_0).$$

    This is the same as the second equation in the statement of the mean value theorem, where
    $a = x_0$ and $b = x$.

    The truth is that since we would have to determine $x^*$, this formula is not really useful for computation. It does,
    however, allow us to learn several things about the remainder. The first is that for $x^*$ to exist,
    $f^{(n+1)}(x)$ must exist and be continuous in the interval $[x_0, x]$. Since, in practice,
    the only thing we fix is $x_0$, and we then use the approximation at different values of $x$, this
    means that to be sure that we have a remainder, $f^{(n+1)}(x)$ has to be continuous in the domain
    of the function. The other thing that it allows us to do is to start to understand when a Taylor
    series is convergent. Remember that we said that it is convergent when $P_n \to f(x)$
    as $n \to \infty$. This is equivalent to saying that $R_n \to 0$ as $n \to \infty$. Consider, then, equation (49).
    For a fixed $x$, the factor $(x - x_0)^{n+1}$ changes at most geometrically with $n$, while the factorial
    $(n+1)!$ in the denominator grows faster than any geometric sequence, so $(x - x_0)^{n+1}/(n+1)!$ goes to
    0 as $n \to \infty$. So $R_n \to 0$ as $n \to \infty$ if $f^{(n+1)}(x^*)$ stays bounded as $n$ grows. This will also happen
    if, although the $(n+1)$th derivative at $x^*$ increases, it increases at a lower rate than $(n+1)!/|x - x_0|^{n+1}$.

    Example 24

    To help illustrate the Lagrange remainder, let $f(x) = 1/(x - 2)$. Let’s do a 3rd-order Maclaurin
    expansion.

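    For the derivatives, it is useful to realize that $f(x) = (x - 2)^{-1}$. Now we find the values of
    the function and the first three derivatives at $x = 0$.

    $$\begin{aligned}
    f(x) &= \frac{1}{x - 2}, & f(0) &= -\frac{1}{2} \\
    f'(x) &= -\frac{1}{(x - 2)^2}, & f'(0) &= -\frac{1}{4} \\
    f''(x) &= \frac{2}{(x - 2)^3}, & f''(0) &= -\frac{2}{8} = -\frac{1}{4} \\
    f'''(x) &= -\frac{6}{(x - 2)^4}, & f'''(0) &= -\frac{6}{16} = -\frac{3}{8}
    \end{aligned}$$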

    We can now form the full expansion with the remainder. Looking at equation (49), we see
    that we need the fourth derivative evaluated at $x = x^*$. We then have

    $$f^{(4)}(x) = \frac{24}{(x - 2)^5}, \qquad f^{(4)}(x^*) = \frac{24}{(x^* - 2)^5}.$$


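    This means that the remainder is

    $$R_3 = \frac{24}{(x^* - 2)^5 \cdot 4!}x^4 = \frac{x^4}{(x^* - 2)^5}.$$

    We then have that

    $$\begin{aligned}
    f(x) &= -\frac{1}{2} - \frac{x}{4} - \frac{x^2}{4 \cdot 2!} - \frac{3x^3}{8 \cdot 3!} + \frac{x^4}{(x^* - 2)^5} \\
    &= -\frac{1}{2} - \frac{x}{4} - \frac{x^2}{8} - \frac{x^3}{16} + \frac{x^4}{(x^* - 2)^5}.
    \end{aligned}$$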

    We now have the full expansion, so this will always return the same value as the
    original function, for the appropriate $x^*$. To see this, let’s consider $x = 1$. From the
    original function, we have that

    $$f(1) = \frac{1}{1 - 2} = -1.$$

    This means that

    $$\begin{aligned}
    -1 &= -\frac{1}{2} - \frac{1}{4} - \frac{1}{8} - \frac{1}{16} + \frac{1}{(x^* - 2)^5} \\
    -1 &= -\frac{15}{16} + \frac{1}{(x^* - 2)^5} \\
    -\frac{1}{16} &= \frac{1}{(x^* - 2)^5} \\
    (x^* - 2)^5 &= -16 \\
    x^* - 2 &= (-16)^{1/5} \approx -1.74 \\
    x^* &\approx -1.74 + 2 = 0.26
    \end{aligned}$$

    So, as you can see, we can find the value of $x^*$ that makes the expansion return exactly
    the same value as the original function at a given value of $x$. The problem is that the value
    of $x^*$ depends on the value of $x$ at which we want to evaluate the function. This means
    that the value $x^* = 0.26$ is only valid for the 3rd-order Maclaurin expansion of the original
    function when $x = 1$. For any other order of expansion, or any other value of $x$, the value
    of $x^*$ will be different.
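
    The dependence of $x^*$ on $x$ is easy to see numerically. The sketch below solves $R_3 = x^4/(x^* - 2)^5$ for $x^*$, using $R_3 = f(x) - P_3$ from Example 24; the function names and the sample values of $x$ other than $x = 1$ are assumptions made for illustration.

```python
def f(x):
    """Original function from Example 24."""
    return 1 / (x - 2)


def p3(x):
    """Third-order Maclaurin polynomial from Example 24."""
    return -1 / 2 - x / 4 - x**2 / 8 - x**3 / 16


def x_star(x):
    """Solve R3 = x**4 / (x_star - 2)**5 for x_star, for x != 0 and x != 2."""
    remainder = f(x) - p3(x)
    ratio = x**4 / remainder  # this equals (x_star - 2)**5
    # Take the real fifth root, keeping the sign of ratio.
    root = abs(ratio) ** (1 / 5)
    if ratio < 0:
        root = -root
    return 2 + root


for x in (0.5, 1.0, 1.5):
    print(f"x={x}: x*={x_star(x):.4f}")
```

    For $x = 1$ this reproduces the value of roughly 0.26 found above, and each other $x$ gives a different $x^*$ between 0 and $x$.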

    11 Conditions for a Local Maximum or Minimum

    The expansion of a function into a Taylor or Maclaurin series is going to allow us to develop
    a general test for a local maximum or minimum, one that gives us a condition that is both
    necessary and sufficient for there to be a local maximum, a local minimum, or an inflection point
    at a certain $x$ value. To get there, it is convenient to realize something. Let’s say we have a
    local maximum at $x = x_0$. This means that for values of $x$ in the immediate neighborhood of $x_0$,
    on both sides of $x_0$, $f(x) < f(x_0)$. Similarly, if we have a local minimum at $x = x_0$, then
    $f(x) > f(x_0)$ for all values of $x$ in the immediate neighborhood of $x_0$, both to the right
    and to the left of $x_0$.

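    [Figure 15: Local Maximum and Minimum. Panel (a): $y = f(x)$ with a local maximum at $x_0$; panel (b): $y = f(x)$ with a local minimum at $x_0$. Both panels mark $x_1$, $x_0$, $x_2$ on the horizontal axis and $f(x_1)$, $f(x_0)$, $f(x_2)$ on the vertical axis.]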

    Figure 15 illustrates the observation we just made. In figure 15a we have a local maximum at
    $x_0$, so for both $x_1$ and $x_2$, $f(x_1) < f(x_0)$ and $f(x_2) < f(x_0)$. Figure 15b presents the case of
    a local minimum, where we can see that at $x_1$ and $x_2$, both $f(x_1) > f(x_0)$ and $f(x_2) > f(x_0)$.
    Notice, then, that in the immediate neighborhood of $x_0$, $f(x) - f(x_0) < 0$ for a maximum, and
    $f(x) - f(x_0) > 0$ for a minimum.

    If we expand $f(x)$ into an $n$th-order Taylor series, using the Lagrange form of the remainder, we
    would have

    $$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n + \frac{f^{(n+1)}(x^*)}{(n+1)!}(x - x_0)^{n+1},$$

    which means that

    $$f(x) - f(x_0) = f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n + \frac{f^{(n+1)}(x^*)}{(n+1)!}(x - x_0)^{n+1}. \tag{50}$$

    The question is how we can determine the sign of $f(x) - f(x_0)$ from the right-hand side of
    equation (50). Notice that there are $n + 1$ terms on that right-hand side: $n$ terms
    remaining from $P_n$, and the remainder. However, we’re trying to determine whether there is
    a maximum or a minimum, so the value of $n$ we choose will depend on the values of the different-order
    derivatives evaluated at $x_0$. Let’s consider some cases.


    Case 1: $f'(x_0) \neq 0$

    In this case we choose $n = 0$, so no derivative terms from $P_n$ remain, and we have that

    $$f(x) - f(x_0) = f'(x^*)(x - x_0).$$

    The sign of $f(x) - f(x_0)$ is, then, the same as the sign of $f'(x^*)$ times the sign of $(x - x_0)$. Now,
    remember that for a local maximum or minimum $x$ has to be in the immediate neighborhood
    of $x_0$, i.e. it has to be a value very close to $x_0$, whether on the left or on the right. Also
    remember that the $x^*$ in the remainder has to be between $x$ and $x_0$. This means that $x^*$ has to
    be even closer to $x_0$. Since $f'(x_0) \neq 0$, the sign of the derivative cannot change for values that are
    very close to $x_0$, so $f'(x^*)$ will have the same sign as $f'(x_0)$, and it will not change whether $x$ is
    to the left or to the right of $x_0$. Clearly, if $x > x_0$ then $x - x_0 > 0$, and if $x < x_0$
    then $x - x_0 < 0$. Since $f'(x^*)$ has the same sign in both cases, the sign of the difference $f(x) - f(x_0)$ will
    change as we move from the left of $x_0$ to the right of $x_0$. This means that we can’t have either a
    maximum or a minimum at $x_0$, because we saw that for either a maximum or a minimum at $x_0$
    the sign would have to be the same in the immediate neighborhood on both sides of $x_0$.[4]

    Case 2: $f'(x_0) = 0$, $f''(x_0) \neq 0$

    In this case we choose $n = 1$, so that the remainder is based on the second derivative. We then
    would have

    $$\begin{aligned}
    f(x) - f(x_0) &= f'(x_0)(x - x_0) + \frac{f''(x^*)}{2!}(x - x_0)^2 \\
    &= \frac{1}{2}f''(x^*)(x - x_0)^2.
    \end{aligned}$$

    Now, we know that $f''(x^*)$ will have the same sign as $f''(x_0)$, for the same reasons that $f'(x^*)$ had
    the same sign as $f'(x_0)$ in the previous case. Notice that now $(x - x_0)^2$ is always positive.
    This means that $f(x) - f(x_0)$ will have the same sign in the immediate neighborhood of
    $x_0$ on both sides of $x_0$, and that sign will be that of $f''(x_0)$. Remember that in that immediate
    neighborhood of $x_0$, $f(x) - f(x_0) < 0$ for a local maximum and $f(x) - f(x_0) > 0$ for a local
    minimum. This means that if $f'(x_0) = 0$ we will have

    A local maximum of $f(x)$ if $f''(x_0) < 0$
    A local minimum of $f(x)$ if $f''(x_0) > 0$

    [given that $f'(x_0) = 0$]

    This is clearly the sufficient condition for a maximum and a minimum that we saw earlier.
    Remember that this is not a necessary condition, because we could have a maximum and a
    minimum even if the second derivative is zero, which is why we’re deriving this more general
    condition.

    [4] This is, in fact, proof that in order for a differentiable function to have a local maximum or a minimum at
    $x_0$ it is necessary that its first derivative equals zero.


    Case 3: $f'(x_0) = f''(x_0) = 0$, $f'''(x_0) \neq 0$

    In this case we choose $n = 2$, again the order of the last derivative that is equal to zero. We thus
    have that

    $$\begin{aligned}
    f(x) - f(x_0) &= f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \frac{f'''(x^*)}{3!}(x - x_0)^3 \\
    &= \frac{1}{6}f'''(x^*)(x - x_0)^3,
    \end{aligned}$$

    because $f'(x_0) = f''(x_0) = 0$. We know that $f'''(x^*)$ will have the same sign as $f'''(x_0)$ no matter
    on which side of $x_0$ we are in its immediate neighborhood, but since $x - x_0$ is raised to an odd
    power, the sign of $f(x) - f(x_0)$ will change as we move from the left of $x_0$ to the right of $x_0$. We, therefore, have
    neither a local maximum nor a local minimum at $x_0$. However, since $f'(x_0) = 0$, we know that
    this is a critical point. Since at this point we have neither a local maximum nor a local minimum,
    we have an inflection point.

    11.1 General Test at a Critical Point

    From the three cases we have just considered we can see a pattern emerging. First, we now know
    why, in order for there to be a local maximum or minimum, as well as an inflection point, it’s
    necessary that $f'(x_0) = 0$. We have also seen how we can determine the sign of $f(x) - f(x_0)$ by
    setting the order of the Taylor series, $n$, to that of the last derivative that is equal to zero. Whether
    the sign of $f(x) - f(x_0)$ is the same on both sides of $x_0$ then depends on whether $(x - x_0)$ is raised to an
    even power or an odd power, i.e. on whether $n + 1$ is even or odd. If it’s odd there will be neither
    a maximum nor a minimum, and if it’s even, whether we have a maximum or a minimum depends
    on the sign of $f^{(n+1)}(x_0)$. We thus have the following general test:

    For a function $f(x)$ whose first nonzero derivative at $x_0$, $f^{(N)}(x_0)$, is of order $N > 1$,
    $f(x_0)$ will be

    a. a local maximum if $N$ is even and $f^{(N)}(x_0) < 0$,

    b. a local minimum if $N$ is even and $f^{(N)}(x_0) > 0$,

    c. an inflection point if $N$ is odd.

    This is a general way of finding a local minimum, a local maximum or an inflection point. Notice
    that what is critical is that $f'(x_0) = 0$, so we will still use that to find the value at which we must
    evaluate all other derivatives.

    Example 25

    Examine the function $y = (7 - x)^4$ for its local extremum.

    Letting $y \equiv f(x)$, we know that in order to have a local maximum or minimum we need
    $f'(x_0) = 0$, so to find $x_0$ we set the first derivative equal to zero and solve. The first derivative is

    $$f'(x) = -4(7 - x)^3.$$

    This equals zero at $x_0 = 7$. The rest of the derivatives are

    $$\begin{aligned}
    f''(x) &= 12(7 - x)^2, & f''(7) &= 0 \\
    f'''(x) &= -24(7 - x), & f'''(7) &= 0 \\
    f^{(4)}(x) &= 24, & f^{(4)}(7) &= 24.
    \end{aligned}$$

    The order of the first derivative that is different from zero is therefore 4, which is even,
    so we have either a minimum or a maximum at $x = 7$. Since $f^{(4)}(7) > 0$,
    we have a local minimum at $x = 7$.
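
    For polynomials, the general test is mechanical enough to sketch in a few lines of Python. The version below is only an illustration under stated assumptions: the polynomial is given as a list of integer coefficients in increasing powers of $x$ (so that comparisons with zero are exact), and the helper names are hypothetical. It is applied to $(7 - x)^4$ from Example 25.

```python
from math import comb


def derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...] (increasing powers)."""
    return [j * c for j, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x."""
    return sum(c * x**j for j, c in enumerate(coeffs))


def general_test(coeffs, x0):
    """Classify x0 using the order N of the first nonzero derivative at x0."""
    current = derivative(coeffs)
    if evaluate(current, x0) != 0:
        return "not a critical point"
    order = 1
    while current:
        current = derivative(current)
        order += 1
        value = evaluate(current, x0)
        if value != 0:
            if order % 2 == 1:
                return "inflection point"
            return "local minimum" if value > 0 else "local maximum"
    return "constant function"


# Coefficients of (7 - x)**4 from the binomial theorem: C(4, j) * 7**(4 - j) * (-1)**j.
coeffs_example_25 = [comb(4, j) * 7 ** (4 - j) * (-1) ** j for j in range(5)]
print(general_test(coeffs_example_25, 7))
```

    The same function classifies $x^3$ at $x_0 = 0$ as an inflection point and $-x^2$ at $x_0 = 0$ as a local maximum, matching cases c and a of the test.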

    12 Homework Problems

    Problem 1

    Consider a monopolist that faces an inverse demand function $P(Q) = 100 - 2Q$ and a cost
    function $C(Q) = 100 + 20Q$.

    (a) Write the revenue function and the profit function.

    (b) Write the marginal revenue function and the marginal cost function.

    (c) At what output, $Q^*$, is profit maximized?

    (d) Check the second-order condition to confirm that at $Q = Q^*$ we indeed have a maximum.

    (e) What are the optimal levels of revenue, $R^*$, cost, $C^*$, and profit, $\pi^*$?

    Problem 2

    Consider the function y “ 23x´2×2 .

    (a) What is the general expression of the elasticity in terms of x?

    (b) What value does the elasticity actually have when $x = 4$?

    Problem 3

    Suppose the cost function of producing $Q > 0$ units of a commodity is $C(Q) = aQ^2 + bQ + c$,
    where $a$, $b$, and $c$ are all constants.

    (a) Find the critical value of $Q$ that minimizes the average cost function, $AC(Q) = C(Q)/Q$
    (this is called the minimum efficient scale in microeconomics).

    (b) Find the marginal cost function $MC(Q) = dC(Q)/dQ$, and show that $MC(Q) = AC(Q)$
    at the critical value of $Q$ you found in part (a).

    Problem 4

    Consider a person whose utility of money, $x$, is $U(x) = \ln(1 + 0.5x)$. For simplicity assume that
    money cannot be negative, i.e. the person can’t borrow, so that $x \ge 0$. The person is offered a bet
    with two possible payouts: \$5 with a probability of 0.25, and \$25 with a probability
    of 0.75.

    (a) Is this a risk averse, risk neutral, or risk loving individual? How do you know?

    (b) If this were a fair game, what would the cost of the bet be?

    (c) How much should this bet cost so that this particular individual would be indifferent
    between making the bet or not? (Hint: remember that an individual is indifferent when
    the utilities of the two options are the same.)

    (d) Is the cost in part (c) higher or lower than if the bet were a fair game? Does this have
    anything to do with the person’s attitude towards risk that you mentioned in part (a)?

    Problem 5

    Consider a plantation of pines that currently has a value of \$5,000. The value
    grows at a continuous rate of $4t^{1/4}$.

    (a) Write the expression for the present value, $PV$, of the plantation in terms of $t$ and the
    interest rate, $r$.

    (b) Write the expression for the optimal time, $t^*$, to cut and sell the pine timber as a function
    of the interest rate, $r$.

    (c) Check the second-order condition for a maximum at the optimal value of $t^*$. Does it hold,
    knowing that $r > 0$?

    (d) Assume that $r = 0.04$. What is the value of $t^*$ and of $PV^*$?

    Problem 6

    Consider the function $f(x) = 2/(3x + 1)$.

    (a) We’re going to consider a 2nd-order Taylor expansion around the point $x = 2$. What is the
    2nd-order polynomial that approximates $f(x)$? That is, find the expression for $P_2$ in this
    case.

    (b) What is the general form of the Lagrange remainder for this case? That is, find the
    expression for $R_2$ in this case. (Hint: this is a function of $x$ and $x^*$.)

    (c) Now consider that $x = 4$. What is the value of $f(4)$?

    (d) What is the value of $P_2$ that you found in part (a) when evaluated at $x = 4$?

    (e) What is, then, the value of the remainder $R_2$ when $x = 4$? (Hint: this is an actual number,
    not a function of $x^*$.)

    (f) What is the value of $x^*$ that will make the function and the full expansion have the same
    value at $x = 4$?

    Problem 7

    Consider the function $y = [(x - 7)x]^2$.

    (a) At what value(s) of x is it possible that we have a local maximum or minimum, or an
    inflection point?

    (b) Use the general test we have seen in section 11.1 to determine whether we have a maximum,
    minimum, or inflection point, for each of the critical values you found in part (a).


