The Hasse derivative.

Abstract.

In real and complex analysis, the Taylor series expansion is a very important tool. For polynomials over arbitrary unitary rings, it is possible to define a derivative which behaves similarly to the usual derivative; unfortunately, the Identity Theorem and Taylor’s formula do not transfer to this new situation. Fortunately, there exists a different definition of derivatives for these cases, namely the Hasse derivative. Not only does it give back an Identity Theorem and Taylor’s formula, but it also allows one to write other identities in a simpler way.

In real and complex analysis, one has a powerful tool, namely the Taylor expansion, which expands an analytic function into a power series. In algebra, one can define the derivative of a polynomial as well; for f = \sum_{i=0}^n a_i x^i \in R[x], R being a unitary ring, define f' := \sum_{i=1}^n i a_i x^{i-1} \in R[x]. This satisfies the same rules as the usual derivative, for example R-linearity and the product rule (f g)' = f' g + f g'. One can also define f^{(k)} recursively by f^{(0)} = f, f^{(k + 1)} = (f^{(k)})' for k \in \N. Unfortunately, one loses certain properties; for example, if R is of finite characteristic m > 0, the polynomial f = x^m \in R[x] satisfies f' = m x^{m-1} = 0, but is not constant as f(0) = 0 \neq 1 = f(1) (assuming R is not the zero ring). In particular, the Identity Theorem does not work. This example also shows one problem with a possible Taylor expansion: for that, one needs to compute \frac{f^{(i)}(a)}{i!} for i = 0, \dots, \deg f; but m = 0 in R, whence m! = 0 has no inverse in R. Hence, a Taylor expansion in the classical sense cannot be defined. A “fix” for this problem is offered by Hasse derivatives: they are defined to make both the Identity Theorem and Taylor expansions work again.
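The failure described above can be observed in a small experiment. The following is only a sketch under assumptions that are not part of the text: polynomials over R = \Z/m\Z are stored as coefficient lists (index i holds a_i), and the helper names formal_derivative and evaluate are made up for this illustration.

```python
def formal_derivative(coeffs, modulus):
    """Formal derivative of sum(coeffs[i] * x**i) over Z/modulus Z."""
    # The coefficient a_i of x^i contributes i * a_i to x^(i-1);
    # dropping the first entry performs the shift of exponents.
    return [(i * a) % modulus for i, a in enumerate(coeffs)][1:]


def evaluate(coeffs, point, modulus):
    """Evaluate the polynomial at `point` by Horner's rule, modulo `modulus`."""
    result = 0
    for a in reversed(coeffs):
        result = (result * point + a) % modulus
    return result


m = 5
f = [0] * m + [1]                  # f = x^5 over Z/5Z

print(formal_derivative(f, m))     # [0, 0, 0, 0, 0], so f' = 0
print(evaluate(f, 0, m), evaluate(f, 1, m))  # 0 1, so f is not constant
```

So f' vanishes identically although f takes two different values.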

Definition.
Let f = \sum_{i=0}^n a_i x^i \in R[x] and k \in \N. Define \displaystyle  D^{(k)} f := \sum_{i=k}^n \binom{i}{k} a_i x^{i - k} \in R[x]. The function D^{(k)} : R[x] \to R[x] is called the k-th Hasse derivative.
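The definition translates almost literally into code. The sketch below again assumes R = \Z/m\Z with polynomials stored as coefficient lists; the function name hasse_derivative and the representation of the zero polynomial by the empty list are choices made for this illustration only.

```python
from math import comb


def hasse_derivative(coeffs, k, modulus):
    """k-th Hasse derivative of sum(coeffs[i] * x**i) over Z/modulus Z."""
    # The term a_i x^i with i >= k is mapped to binom(i, k) a_i x^(i-k);
    # terms with i < k vanish.
    return [(comb(i, k) * coeffs[i]) % modulus for i in range(k, len(coeffs))]


p = 3
f = [1, 0, 2, 1]                   # f = 1 + 2x^2 + x^3 over Z/3Z

print(hasse_derivative(f, 1, p))   # [0, 1, 0], i.e. x (equal to f')
print(hasse_derivative(f, 2, p))   # [2, 0],    i.e. 2
print(hasse_derivative(f, 3, p))   # [1],       i.e. 1
```

Note that the third iterated formal derivative of f is 3! = 6 = 0 in \Z/3\Z, whereas the third Hasse derivative is 1: the binomial coefficient \binom{3}{3} = 1 survives the reduction modulo 3.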

The Hasse derivative shares several properties with the usual derivative, but not all of them; for example, D^{(k)} D^{(\ell)} \neq D^{(k+\ell)} in general. But we have the following properties:

Theorem.
Let f, g \in R[x] and \lambda \in R, and k, \ell \in \N.
  1. We have that D^{(k)} is R-linear, i.e. D^{(k)} (f + g) = D^{(k)} f + D^{(k)} g and D^{(k)}(\lambda f) = \lambda D^{(k)} f.
  2. We have k! \cdot D^{(k)} f = f^{(k)}; in particular, D^{(1)} f = f'.
  3. We have D^{(k)} D^{(\ell)} f = \binom{k + \ell}{\ell} D^{(k+\ell)} f.
  4. (Leibniz Rule) We have \displaystyle  D^{(k)}(f g) = \sum_{i=0}^k D^{(i)} f \cdot D^{(k-i)} g; more generally, for f_1, \dots, f_t \in R[x], we have \displaystyle  D^{(k)} \prod_{i=1}^t f_i = \sum_{m_1 + \dots + m_t = k} \prod_{i=1}^t D^{(m_i)} f_i, where the sum goes over all such tuples (m_1, \dots, m_t) \in \N^t.
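  5. (Faà di Bruno’s Formula) We have \displaystyle  D^{(k)} (f \circ g) = \sum \binom{c_1 + \dots + c_k}{c_1, \dots, c_k} (D^{(c_1 + \dots + c_k)} f) \circ g \cdot \prod_{j=1}^k (D^{(j)} g)^{c_j}, where the sum goes over all tuples (c_1, \dots, c_k) \in \N^k with \sum_{i=1}^k i c_i = k; here, for s = c_1 + \dots + c_k, \binom{s}{c_1, \dots, c_k} is a multinomial coefficient having the value \displaystyle  \frac{s!}{c_1! \cdots c_k!}. In particular, for f = x^n, n \in \N, this reads \displaystyle  D^{(k)} (g^n) = \sum \binom{n}{c_0, c_1, \dots, c_k} g^{c_0} \cdot \prod_{j=1}^k (D^{(j)} g)^{c_j}, where the sum goes over all tuples (c_0, \dots, c_k) \in \N^{k+1} with \sum_{i=0}^k c_i = n and \sum_{i=0}^k i c_i = k, and where \binom{n}{c_0, c_1, \dots, c_k} = \frac{n!}{c_0! \cdot c_1! \cdots c_k!}.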
  6. (Taylor Formula) We have \displaystyle  f = \sum_{i=0}^{\deg f} (D^{(i)} f)(\lambda) (x - \lambda)^i.
  7. (Identity Theorem) If we have (D^{(i)} f)(\lambda) = 0 for all i \ge 0, then f = 0.

For that reason, one can introduce the notation \frac{f^{(k)}}{k!} := D^{(k)} f, so that we can write Taylor’s formula in a more familiar form as \displaystyle  f = \sum_{i=0}^{\deg f} \frac{f^{(i)}}{i!}(\lambda) (x - \lambda)^i, which almost equals the classical form.

Note that the Leibniz rule, Faà di Bruno’s formula and Taylor’s formula take simpler forms than their classical counterparts; this is due to the fact that the additional factorial terms or binomial coefficients already hide in the definition of the Hasse derivative.
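Before turning to the proof, some of these properties can be tested numerically. The following sketch checks the Leibniz rule, property 3 and the Taylor formula for one pair of polynomials over \Z/5\Z. The coefficient-list representation, the helper names and the chosen polynomials are assumptions made for this illustration; such a check is of course no substitute for the proof.

```python
from math import comb


def hasse(f, k, p):
    """k-th Hasse derivative of a coefficient list f over Z/pZ."""
    return [(comb(i, k) * f[i]) % p for i in range(k, len(f))]


def add(f, g, p):
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [(a + b) % p for a, b in zip(f, g)]


def mul(f, g, p):
    if not f or not g:
        return []                      # the empty list is the zero polynomial
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def trim(f):
    """Drop trailing zeros so that equal polynomials compare equal."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def evaluate(f, point, p):
    result = 0
    for a in reversed(f):
        result = (result * point + a) % p
    return result


p = 5
f = [3, 1, 0, 4, 0, 2]                 # 3 + x + 4x^3 + 2x^5
g = [1, 2, 0, 0, 1]                    # 1 + 2x + x^4
k = 3

# Leibniz rule: D^(k)(f g) = sum_i D^(i) f * D^(k-i) g
rhs = []
for i in range(k + 1):
    rhs = add(rhs, mul(hasse(f, i, p), hasse(g, k - i, p), p), p)
leibniz_ok = trim(hasse(mul(f, g, p), k, p)) == trim(rhs)

# Property 3 with k = 2, l = 1: D^(2) D^(1) f = binom(3, 1) D^(3) f
comp_lhs = trim(hasse(hasse(f, 1, p), 2, p))
comp_rhs = trim([(comb(3, 1) * c) % p for c in hasse(f, 3, p)])
composition_ok = comp_lhs == comp_rhs

# Taylor formula around lam: f = sum_i (D^(i) f)(lam) (x - lam)^i
lam = 2
taylor, power = [], [1]                # power holds (x - lam)^i
for i in range(len(f)):
    c = evaluate(hasse(f, i, p), lam, p)
    taylor = add(taylor, [(c * a) % p for a in power], p)
    power = mul(power, [(-lam) % p, 1], p)
taylor_ok = trim(taylor) == trim(f)

print(leibniz_ok, composition_ok, taylor_ok)   # True True True
```

Working modulo 5 with a polynomial of degree 5 is deliberate: this is exactly the situation in which the classical Taylor formula with the factor 1/5! is not available.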

Proof.
  1. This follows from the definition of D^{(k)}.
  2. Write f = \sum_{i=0}^n a_i x^i; then \displaystyle  k! \cdot D^{(k)} f = k! \sum_{i=k}^n \binom{i}{k} a_i x^{i - k} = \sum_{i=k}^n k! \binom{i}{k} a_i x^{i-k} = \sum_{i=k}^n a_i \cdot i (i - 1) (i - 2) \cdots (i - k + 1) x^{i - k} = \sum_{i=k}^n a_i (x^i)^{(k)} = f^{(k)}.
  3. By 1., it suffices to show this for f = x^i, i \ge k + \ell (for smaller i, both sides will be zero). We have \displaystyle  D^{(k)} D^{(\ell)} f = D^{(k)} \biggl( \binom{i}{\ell} x^{i - \ell} \biggr) = \binom{i - \ell}{k} \binom{i}{\ell} x^{i - k - \ell} and \displaystyle  D^{(k + \ell)} f = \binom{i}{k + \ell} x^{i - k - \ell}. But since \displaystyle  \binom{k + \ell}{\ell} \binom{i}{k + \ell} = \frac{(k + \ell)!}{\ell! k!} \frac{i!}{(k + \ell)! (i - k - \ell)!} = \frac{i!}{\ell! k! (i - k - \ell)!} = \frac{(i - \ell)!}{k! (i - k - \ell)!} \frac{i!}{\ell! (i - \ell)!} = \binom{i - \ell}{k} \binom{i}{\ell}, the two sides are equal.
  4. Note that if we fix f, we get an R-linear function R[x] \to R[x], g \mapsto D^{(k)} (f g). Hence, it suffices to show this for arbitrary f \in R[x] and g = x^m, m \in \N. By the same argument, for g = x^m, we get an R-linear function R[x] \to R[x], f \mapsto D^{(k)} (f x^m); therefore, it suffices to consider f = x^n, n \in \N. But now, \displaystyle  D^{(k)} (x^n x^m) = D^{(k)} x^{n + m} = \binom{n + m}{k} x^{n + m - k} and \displaystyle  D^{(i)} x^n \cdot D^{(k-i)} x^m = \binom{n}{i} x^{n - i} \binom{m}{k - i} x^{m - (k - i)} = \binom{n}{i} \binom{m}{k - i} x^{n + m - k}. Hence, it suffices to show \sum_{i=0}^k \binom{n}{i} \binom{m}{k - i} = \binom{n + m}{k}. This is Vandermonde’s Identity and, hence, true; for k \le n + m, reorganizing the binomial coefficients transforms it into the equivalent equality \displaystyle  \sum_{i=0}^k \binom{k}{i} \binom{n + m - k}{n - i} = \binom{n + m}{n}.
    The more general equation is shown by induction on t. For t = 1, we have \sum_{m_1 + \dots + m_t = k} \prod_{i=1}^t D^{(m_i)} f_i = D^{(k)} f_1. Now, assume that the equation is true for all k for some t \ge 1. Then, for any k, \displaystyle  D^{(k)} \prod_{i=1}^{t+1} f_i = D^{(k)} \biggl( f_{t+1} \cdot \prod_{i=1}^t f_i \biggr) = \sum_{m_{t+1} = 0}^k D^{(m_{t+1})} f_{t+1} \cdot D^{(k-m_{t+1})} \biggl( \prod_{i=1}^t f_i \biggr) = \sum_{m_{t+1} = 0}^k D^{(m_{t+1})} f_{t+1} \cdot \sum_{m_1 + \dots + m_t = k - m_{t+1}} \prod_{i=1}^t D^{(m_i)} f_i by the Leibniz rule and by the induction hypothesis; here, the second sum goes over all such tuples (m_1, \dots, m_t) \in \N^t. But this equals \sum_{m_1+\dots+m_t+m_{t+1}=k} \prod_{i=1}^{t+1} D^{(m_i)} f_i, which is what we had to show.
  5. Again, by 1., it suffices to show this for f = x^n, n \in \N. Now, by the second part of 4., \displaystyle  D^{(k)}(g^n) = D^{(k)}(g \cdots g) = \sum_{m_1 + \dots + m_n = k} \prod_{i=1}^n D^{(m_i)} g, where the sum goes over all such (m_1, \dots, m_n) \in \N^n. The formula we want is now obtained by sorting the summands by the different powers of D^{(i)} g appearing, 0 \le i \le k.
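    Consider the map \displaystyle  \varphi : \N^n \to \N^{k+1}, \quad m = (m_1, \dots, m_n) \mapsto (c_0(m), \dots, c_k(m)), where c_i(m) := \lvert \{ j \in \{ 1, \dots, n \} \mid m_j = i \} \rvert, 0 \le i \le k. Now, if m = (m_1, \dots, m_n) \in \N^n satisfies \sum_{i=1}^n m_i = k, then every m_i is at most k, whence \sum_{j=0}^k j c_j(m) = k and \sum_{j=0}^k c_j(m) = n. Conversely, for a fixed (c_0, \dots, c_k) \in \N^{k+1} with \sum_{i=0}^k c_i = n and \sum_{i=0}^k i c_i = k, the number \lvert \varphi^{-1}(c_0, \dots, c_k) \rvert of preimages equals the multinomial coefficient \displaystyle  \binom{n}{c_0, c_1, \dots, c_k}, and the summand \prod_{i=1}^n D^{(m_i)} g has the same value g^{c_0} \cdot \prod_{j=1}^k (D^{(j)} g)^{c_j} for all m in this preimage. Hence the above formula for D^{(k)}(g^n) equals \displaystyle  \sum \binom{n}{c_0, c_1, \dots, c_k} g^{c_0} \cdot \prod_{j=1}^k (D^{(j)} g)^{c_j}, where the sum goes over all tuples (c_0, c_1, \dots, c_k) \in \N^{k+1} with \sum_{i=0}^k i c_i = k and \sum_{i=0}^k c_i = n. Finally, writing s := c_1 + \dots + c_k = n - c_0, we have \displaystyle  \binom{n}{c_0, c_1, \dots, c_k} g^{c_0} = \binom{s}{c_1, \dots, c_k} \binom{n}{s} g^{n - s} = \binom{s}{c_1, \dots, c_k} (D^{(s)} f) \circ g for f = x^n, which expresses D^{(k)}(f \circ g) in terms of the Hasse derivatives of f and g.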
  6. By 1., it suffices to show this for f = x^n, n \in \N. Now \displaystyle  \sum_{i=0}^{\deg f} (D^{(i)} f)(\lambda) (x - \lambda)^i = \sum_{i=0}^n \biggl(\binom{n}{i} x^{n - i}\biggr)(\lambda)  (x - \lambda)^i = \sum_{i=0}^n \binom{n}{i} \lambda^{n-i} (x - \lambda)^i = ((x - \lambda) + \lambda)^n = x^n by the Binomial Theorem, which is what we had to show.
  7. This follows directly from 6.
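The counting argument in step 5 can be replayed by machine: enumerate all tuples (c_0, \dots, c_k) with \sum c_i = n and \sum i c_i = k, weight each by its multinomial coefficient, and compare the result with D^{(k)}(g^n). The sketch below does this over \Z/3\Z; the coefficient-list representation, the helper names and the sample polynomial g are assumptions made for this illustration only.

```python
from math import comb, factorial


def hasse(f, k, p):
    """k-th Hasse derivative of a coefficient list f over Z/pZ."""
    return [(comb(i, k) * f[i]) % p for i in range(k, len(f))]


def add(f, g, p):
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [(a + b) % p for a, b in zip(f, g)]


def mul(f, g, p):
    if not f or not g:
        return []                      # the empty list is the zero polynomial
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def power(f, e, p):
    out = [1]
    for _ in range(e):
        out = mul(out, f, p)
    return out


def trim(f):
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def multinomial(n, parts):
    out = factorial(n)
    for c in parts:
        out //= factorial(c)
    return out


def tuples(n, k):
    """Yield all (c_0, ..., c_k) with sum(c_i) = n and sum(i * c_i) = k."""
    def rec(j, weight_left, count_left, tail):
        if j == 0:                     # c_0 takes whatever count is left
            if weight_left == 0:
                yield (count_left,) + tail
            return
        for c in range(min(count_left, weight_left // j) + 1):
            yield from rec(j - 1, weight_left - j * c, count_left - c, (c,) + tail)
    yield from rec(k, k, n, ())


p, n, k = 3, 4, 3
g = [2, 1, 0, 1]                       # g = 2 + x + x^3 over Z/3Z

lhs = hasse(power(g, n, p), k, p)      # D^(k)(g^n), computed directly
rhs = []
for c in tuples(n, k):
    term = mul([multinomial(n, c) % p], power(g, c[0], p), p)
    for j in range(1, k + 1):
        term = mul(term, power(hasse(g, j, p), c[j], p), p)
    rhs = add(rhs, term, p)

print(sorted(tuples(n, k)))            # [(1, 3, 0, 0), (2, 1, 1, 0), (3, 0, 0, 1)]
print(trim(lhs) == trim(rhs))          # True
```

The three tuples correspond to the three ways of writing k = 3 as a sum of positive integers (1 + 1 + 1, 1 + 2 and 3), with c_0 counting the remaining factors g that are not differentiated.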


Three responses to “The Hasse derivative.”

  1. Alexey Maevskiy says:

    Hi! Perhaps it may be useful to place here some facts about mixed partial Hasse derivatives of polynomials from R[x_1,\ldots,x_n]. As for me, I worked with polynomials from R[x_1,x_2,x_3] and had to spend a lot of time proving some needed results on their Hasse derivatives. In particular, one of the popular facts is the Taylor Formula.

  2. Hi, sorry for not replying earlier, I was not available the last weeks. This sounds interesting, and I will write up something about that. Thanks for the suggestion!
    Maybe a question for you: are there other facts about (mixed partial) Hasse derivatives which you think are worth showing here?

  3. I wrote a first article on partial Hasse derivatives, which can be found here.
