Wikipedia:Reference desk/Archives/Mathematics/2012 January 14

Mathematics desk
Welcome to the Wikipedia Mathematics Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


January 14

Is there a complete axiom system for 3d real vectors without mentioning coordinates?

Suppose first that you want to prove identities of real numbers with addition and multiplication, such as the following.

(Q0) 4(ps+qr)^2 - (p^2 - q^2 - r^2 + s^2)^2 = (-p+q+r+s)(p-q+r+s)(p+q-r+s)(p+q+r-s)

It is well-known how to do this mechanically: you expand all products of sums into sums of products, then group terms with the same variables together and check that the combined coefficient of each group is zero. What's important for me is that this algorithm not only decides whether an equation is an identity, but that if it is an identity, the algorithm can give a proof that consists of only formal manipulations of the expression using some axioms. (For the purposes of this question, I also don't care about the fact that this algorithm takes exponential time.)
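To make the claim concrete, here is a minimal sketch of such a mechanical check for (Q0); it assumes a Python environment with sympy, which is my choice of tool and not part of the question:

  from sympy import symbols, expand

  p, q, r, s = symbols('p q r s')
  lhs = 4*(p*s + q*r)**2 - (p**2 - q**2 - r**2 + s**2)**2
  rhs = (-p + q + r + s)*(p - q + r + s)*(p + q - r + s)*(p + q + r - s)
  # expand() plays the role of the brute-force normal form described above:
  # the difference of the two sides collapses to the zero polynomial.
  assert expand(lhs - rhs) == 0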

The ring axioms, and precisely how to prove all polynomial identities with them

More precisely, suppose we are working with expressions built from formal variables, rational constants, addition, and multiplication. We want to prove that such an expression is equal to zero for any choice of real numbers substituted for the variables. We want a formal proof that starts from the expression and ends in zero. Each step of the proof replaces a subexpression with another subexpression, where these subexpressions are specializations of the two sides of one of the following axioms (R), with the variables p, q, r replaced by arbitrary expressions.

  • (R0) m + n = k for all rational constants m, n, k where the equation is true
  • (R1) mn = k for all rational constants m, n, k where the equation is true
  • (R2) p + (q + r) = (p + q) + r
  • (R3) 0 + p = p + 0 = p
  • (R4) p + q = q + p
  • (R5) p(qr) = (pq)r
  • (R6) 0p = p0 = 0
  • (R7) 1p = p1 = p
  • (R8) p(q + r) = pq + pr
  • (R9) (p + q)r = pr + qr
  • (R10) pq = qp

The well-known way to find a proof of this form works like this. First, you apply (R8) and (R9) forwards (that is, finding a specialization of the left side and replacing it with the right side) while you can, thus eliminating all products of sums, meanwhile possibly using (R5) and (R10) in either direction to rearrange products of more than two factors so that the factors that are sums get next to each other. Your expression should now be a sum of one or more terms, none of which contains addition. Rearrange each such term using (R5), (R10) and (R1) to bring it to the form mf or m·1, where m is a rational constant and f is a product of variables (you may also need to apply (R7) backwards for terms that have only variables and no constant). In each of these terms, rearrange the product f of variables into a canonical form, such as a left-associated product of the variables in some consistent order, using (R5) and (R10). Now reorder the terms so that terms with the same multiset of variables come together in the sum, using (R2) and (R4). In each of these groups, the sum of the coefficients must vanish, or else the expression can't be identically zero. If they do vanish, repeatedly apply (R9) backwards to factor the terms of a group into the form mf, where m is a sum of constants and f is the common product of variables; then use (R0) to compute m down to the single constant zero. Finally, use (R6) on each term to turn it into zero, and (R0) to collapse the whole sum of zeros to zero.
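To illustrate the normal form this procedure computes (though not the axiom-by-axiom proof trace), here is a toy sketch in Python; the representation, a dict from sorted variable tuples to rational coefficients, and all the names are mine:

  from fractions import Fraction
  from itertools import product

  def const(c):  return {(): Fraction(c)}
  def var(name): return {(name,): Fraction(1)}

  def add(f, g):
      # (R2)-(R4): collect coefficients of equal monomials
      h = dict(f)
      for mono, c in g.items():
          h[mono] = h.get(mono, Fraction(0)) + c
      return {m: c for m, c in h.items() if c != 0}

  def mul(f, g):
      # (R5), (R8)-(R10): expand, then sort each monomial canonically
      h = {}
      for (m1, c1), (m2, c2) in product(f.items(), g.items()):
          m = tuple(sorted(m1 + m2))
          h[m] = h.get(m, Fraction(0)) + c1 * c2
      return {m: c for m, c in h.items() if c != 0}

  # (p+q)(p-q) - (pp - qq) normalizes to the empty dict, i.e. to zero:
  p, q = var('p'), var('q')
  lhs = mul(add(p, q), add(p, mul(const(-1), q)))
  rhs = add(mul(p, p), mul(const(-1), mul(q, q)))
  assert add(lhs, mul(const(-1), rhs)) == {}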

What I'm concerned about is not identities of real numbers, but identities of three-dimensional real vectors using addition, multiplication by a scalar, inner product, and cross product, plus addition and multiplication of real numbers (which can arise from inner products, for example).

Here is one example of a complicated identity of this form (an algebraic form of the Pappus–Pascal theorem).

(Q1) 0 = (((a×e)×(b×f))×c) + (((b×e)×(c×f))×a) + (((c×e)×(a×f))×b)
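Before looking for an axiomatic proof, (Q1) can at least be sanity-checked numerically; a sketch assuming numpy, with random vectors, so this is evidence rather than proof:

  import numpy as np

  rng = np.random.default_rng(0)
  a, b, c, e, f = (rng.standard_normal(3) for _ in range(5))
  x = np.cross  # shorthand for the cross product
  q1 = (x(x(x(a, e), x(b, f)), c)
        + x(x(x(b, e), x(c, f)), a)
        + x(x(x(c, e), x(a, f)), b))
  assert np.allclose(q1, 0.0)  # holds up to floating-point error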

Now my question is whether all such identities have a proof using only formal applications of the axioms of these vector operations. If they do, is there also an algorithm that always finds such a proof; and is there such an algorithm that is also easy to describe? There's also the question of which axioms you need.

The vector axioms I propose

More precisely, we consider expressions built from rational constants, the null vector constant, scalar variables, vector variables, addition and multiplication of scalar expressions, addition of vector expressions, multiplication of a scalar expression with a vector expression, the inner product of two vector expressions, and the cross product of two vector expressions. We want to decide whether such an expression, which can be either vector valued or scalar valued, is zero for all values of real numbers and three-dimensional real vectors substituted for the scalar and vector variables respectively.

I propose the following set of axioms (V,T) in addition to the above axioms (R).

  • (V0) a + (b + c) = (a + b) + c
  • (V1) 0 + a = a + 0 = a
  • (V2) a + b = b + a
  • (V3) (p + q)a = pa + qa
  • (V4) p(a + b) = pa + pb
  • (V5) (pq)a = p(qa)
  • (V6) 0a = 0
  • (V7) 1a = a
  • (T0) (a + b)c = ac + bc
  • (T1) a(b + c) = ab + ac
  • (T2) (pa)b = a(pb) = p(ab)
  • (T3) ab = ba
  • (T4) (a + b)×c = a×c + b×c
  • (T5) a×(b + c) = a×b + a×c
  • (T6) (pa)×b = a×(pb) = p(a×b)
  • (T7) b×a = (-1)(a×b)
  • (T8) a×(pa) = (pa)×a = 0
  • (T9) (a×b)c = a(b×c)
  • (T10) (a×b)×c = (ac)b + (-1)((bc)a)

Some useful consequences derivable from the axioms (R,V,T) follow.

  • (L0) 0a = a0 = 0
  • (L1) 0×a = a×0 = 0
  • (L2) a×(b×c) = (ac)b + (-1)((ab)c)
  • (L3) (a×b)a = (a×b)b = a(a×b) = b(a×b) = 0
  • (L4) (a×b)c = (b×c)a = (c×a)b = (-1)((a×c)b) = (-1)((c×b)a) = (-1)((b×a)c)
  • (L5) (a×b)(c×d) = (ac)(bd) + (-1)((ad)(bc))
  • (L6) (a×b)×c + (b×c)×a + (c×a)×b = 0
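As a sample of how these follow, (L2) can be derived from (T7) and (T10) in two steps, cleaning up with (T3) and the scalar axioms:

  a×(b×c) = (-1)((b×c)×a)              by (T7)
          = (-1)((ba)c + (-1)((ca)b))  by (T10), applied to (b×c)×a
          = (ac)b + (-1)((ab)c)        by (T3), (R) and (V)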

Allow me to summarize the above axioms outside the collapse box.

  • (V) The 3-vectors form a vector space over reals
  • (T0…3) Scalar product is bilinear and commutative
  • (T4…7) Cross product is bilinear and anticommutative
  • (T8) The cross product of a vector and scalar times that vector is zero
  • (T9) The triple product law: (a×b)c = a(b×c)
  • (T10) The vector triple product expansion law: (a×b)×c = (ac)b + (-1)((bc)a)

At first glance, one may think that, just as with real-valued expressions, there's an easy method to simplify any identically zero expression to zero, but this is not so. Let me explain why.

Why brute force expansion doesn't work on vector identities

Just like with scalar expressions, you start by expanding all products of sums, which is clearly possible.

The next step is to eliminate most of the cross products. This is possible because, if any term has more than one cross product in it, you can eliminate at least one of them. Indeed, after pulling out all the multiplications by scalars, you are left with either a cross product one of whose factors is itself a cross product, which you simplify with the identity (T10), or the inner product of two cross products, which you simplify with the identity (L5) from above.
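For instance, (L5), the identity used to remove the inner product of two cross products, can be spot-checked through coordinates; a sympy sketch (my tooling, not part of the argument):

  from sympy import symbols, Matrix, expand

  def vec(name):
      # a vector variable as a column of three scalar symbols, as in (A)
      return Matrix(symbols(f'{name}0 {name}1 {name}2'))

  a, b, c, d = (vec(v) for v in 'abcd')
  lhs = a.cross(b).dot(c.cross(d))
  rhs = a.dot(c) * b.dot(d) - a.dot(d) * b.dot(c)
  assert expand(lhs - rhs) == 0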

If, for simplicity, we suppose that the original expression is scalar, then what we're left with is essentially a linear combination of products of scalar products of variables, such as (ab)(cd)(ef) - 2(ac)(be)(df). You'd think it would be easy to finish from there, wouldn't you?

(Update: actually we're left with a polynomial in the scalar products and triple products of the vector variables, and in the scalar variables. – b_jonas 11:22, 16 January 2012 (UTC))[reply]

But the catch is that such a sum can be identically zero even if none of the terms is repeated. Indeed, suppose k is sufficiently large, and n is much larger than k. Fix n vector variables, and consider all k-wise products of scalar products in which the 2k variables are all different. There are about n^(2k)/(k!·2^k) such products (up to commutativity of the product and of each scalar product). However, each of these lies in the vector space spanned by the (2k)-wise products of the coordinates of the n vector variables, and this vector space (of functions of the 3n coordinates, as a vector space over the reals) has dimension approximately (3n)^(2k)/(2k)!. This dimension is much less than the number of such formal products of scalar products, so there must be a linear combination of these products that is identically zero (and it's easy to see there's also one with integer coefficients).
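The counting can be made exact rather than asymptotic; the sketch below computes both counts with math.comb and searches for the first k where the pigeonhole argument forces a linear relation (the function names are mine):

  from math import comb, factorial

  def formal_products(n, k):
      # choose 2k distinct vector variables, then pair them into k scalar
      # products; there are (2k)!/(k! 2^k) pairings
      return comb(n, 2*k) * factorial(2*k) // (factorial(k) * 2**k)

  def monomial_dim(n, k):
      # number of degree-2k monomials in the 3n scalar coordinates
      return comb(3*n + 2*k - 1, 2*k)

  n = 10**6  # "n much larger than k"
  for k in range(1, 30):
      if formal_products(n, k) > monomial_dim(n, k):
          print(k)  # the first k where a nontrivial relation must exist
          break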

This means there's a nontrivial identity made of only scalar products of vectors and multiplication and addition of reals. Clearly the axioms (R,V,T0-9) are not enough to transform such an expression to zero. It might be possible to prove such an identity by reintroducing cross products through applying (T10) (or (L5)) backwards, but if so, I don't see why that would work in general.

In fact, I don't even know a proof of (Q1) this way, though it's possible there is one. If it bothers you that (Q1) is a vector expression, not a scalar expression, try this instead.

(Q1') 0 = ((((a×e)×(b×f))×c) × (((b×e)×(c×f))×a)) · (((c×e)×(a×f))×b)

Now of course it's possible that there is still a way to reduce any such identity to zero using the axioms (R,V,T), and possibly even an algorithm to find it; I just don't know how to do it. Or maybe you need just a few more axioms that cover all cases, though I can't imagine what they would look like. Maybe you even need to add more operators that appear in intermediate expressions of the derivation, just as (T10) can introduce cross products into expressions using only dot products; but you have to be a bit careful about what you allow, because adding functions that extract the coordinates of the vectors is exactly what I want to avoid. It's of course also possible that you just can't prove identities of 3-vectors this way. Which is the case?

Of course, everyone also knows how to prove all these vector identities in a different way: using coordinates. The method is the following: replace each vector variable a with a vector of three scalar variables (a0, a1, a2), and compute all the vector additions, scalings, inner products and cross products from the coordinates using the following axioms.

Computing with 3-vectors coordinatewise
  • (A0) (a0, a1, a2) + (b0, b1, b2) = (a0 + b0, a1 + b1, a2 + b2)
  • (A1) p(a0, a1, a2) = (pa0, pa1, pa2)
  • (A2) (a0, a1, a2)(b0, b1, b2) = a0b0 + a1b1 + a2b2
  • (A3) (a0, a1, a2)×(b0, b1, b2) = (a1b2 + (-1)(a2b1), a2b0 + (-1)(a0b2), a0b1 + (-1)(a1b0))

Finally, prove the resulting scalar identity using the method for polynomial identities; or, if the original formula was a vector identity, prove the identity of each of the three coordinates this way.
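A sketch of this coordinate method applied to (Q1), assuming sympy; Matrix.cross and Matrix.dot play the roles of (A3) and (A2), and expand() does the polynomial-identity check on each coordinate:

  from sympy import symbols, Matrix, expand

  def vec(name):
      # (A): replace a vector variable by a vector of three scalar variables
      return Matrix(symbols(f'{name}0 {name}1 {name}2'))

  a, b, c, e, f = (vec(v) for v in 'abcef')
  q1 = (a.cross(e).cross(b.cross(f)).cross(c)
        + b.cross(e).cross(c.cross(f)).cross(a)
        + c.cross(e).cross(a.cross(f)).cross(b))
  assert all(expand(coord) == 0 for coord in q1)  # all three coordinates vanish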

Besides expanding into coordinates, there are other methods you can use if you want to prove a vector identity by hand, methods that don't reduce to just replacing subexpressions using the axioms.

Ways to “cheat”, other than using coordinates

One of these is that instead of proving a vector identity 0 = f(a1, …, an), you prove the scalar identity 0 = b·f(a1, …, an), where b is a new independent vector variable. If the latter is true for all b, then the former must be true as well. In fact, you don't even need to multiply by a variable independent of f: instead you can prove that the three inner products af, bf, cf are all identically zero, where f may depend on a, b, c, as long as the three expressions a, b, c are not linearly dependent in general, which is satisfied if, for example, they are three independent variables, or if a and b are independent variables and c = a×b. This technique can at least help make proofs simpler in practice, but I don't know how much it can help in theory. In any case, this alone won't solve all problems, because of the above-mentioned linear combinations of products of scalar products.

Another method is to introduce division by scalars into the expressions, where the denominator is an expression that may be zero for some values, but which you can prove is nonzero in the general case. For example, you could have expressions like ab/ac or ab/a^2, or even allow square roots of generally positive expressions. This is a valid method of proof, because if the original identity is true for generic choices of the variables (say, almost all choices, or choices with the coordinates independent over the rationals), then it must be true for all values. I'm not quite sure, but I think using these kinds of expressions is effectively as strong as using coordinates.

In any case, if possible I'd prefer to see a mechanical method of proving identities without tricks like this, using only axioms, and without involving coordinates.

Final remarks. This question comes from this discussion of vector identities on the KöMaL forum. Also, sorry for any typos in the formulas above, and for the lengthy post.

b_jonas 16:12, 13 January 2012 (UTC)[reply]

Axiom T8 follows from the earlier axioms, so it isn't necessary. It's not clear that axiom T10 is sufficient. You need something to guarantee that if (e_1, e_2, e_3) is a positively oriented orthonormal basis, then e_1×e_2 = e_3. With this and the other axioms, T10 follows as a consequence. Sławomir Biały (talk) 12:20, 14 January 2012 (UTC)[reply]

Responses

The problem is, if you added axioms to make sure e_0, e_1, e_2 is an orthonormal basis, e.g. by defining their lengths and pairwise scalar and cross products, then you'd probably be able to prove a = (ae_0)e_0 + ... + (ae_2)e_2, and do coordinate computations with vectors of the form x_0e_0 + ... + x_2e_2 as in (A). – b_jonas 17:32, 14 January 2012 (UTC)[reply]
Actually, having thought about it, I think T10 is enough. You just need something to normalize the cross product against the other "cross products" that satisfy the remaining axioms. Sławomir Biały (talk) 21:31, 14 January 2012 (UTC)[reply]
I don't think you can normalize anything, because you can't use divisions. You might be able to use an unnormalized but orthogonal triplet of vectors, but I'm not really sure if that helps enough. – b_jonas 08:52, 15 January 2012 (UTC)[reply]
I mean here that something must distinguish this cross product from a "cross product" that satisfies the remaining axioms but is rescaled, a ×′ b = λ(a×b) for some scalar λ. This is a normalization condition on the cross product: it says that the cross product is "compatible" with the scalar product. But I think T10 already does this, so there is nothing to worry about. (Incidentally, I really don't understand your hangup about division.) Sławomir Biały (talk) 02:21, 16 January 2012 (UTC)[reply]
What might help is if I found a nontrivial product of scalar products identity of the kind I've mentioned, preferably a minimal one. I'll try to do that later either by mechanically unravelling (Q1) or some similar identity, or by finding the minimal n and k that works and solving a linear equation. – b_jonas 17:36, 14 January 2012 (UTC)[reply]
Here's one of those identities I mentioned.
0 = - 2(ab)(ac)(bd)(cd) - 2(ab)(ad)(bc)(cd) - 2(ac)(ad)(bc)(bd) - (aa)(bb)(cd)(cd) - (aa)(bc)(bc)(dd) - (aa)(bd)(bd)(cc) - (ab)(ab)(cc)(dd) - (ac)(ac)(bb)(dd) - (ad)(ad)(bb)(cc) + (aa)(bb)(cc)(dd) + (ab)(ab)(cd)(cd) + (ac)(ac)(bd)(bd) + (ad)(ad)(bc)(bc) + 2(aa)(bc)(bd)(cd) + 2(ab)(ac)(bc)(dd) + 2(ab)(ad)(bd)(cc) + 2(ac)(ad)(bb)(cd).
I don't really understand this one, I just found it with a computer. In case it helps, the corresponding identity for two-dimensional real vectors is 0 = - 2(ab)(ac)(bc) - (aa)(bb)(cc) + (aa)(bc)(bc) + (ab)(ab)(cc) + (ac)(ac)(bb).
b_jonas 19:28, 16 January 2012 (UTC)[reply]
I found a general identity. For dimension 2 vectors, this is an identity: 0 = (ad)(be)(cf) + (ae)(bf)(cd) + (af)(bd)(ce) - (af)(be)(cd) - (ae)(bd)(cf) - (ad)(bf)(ce). In general, if you want an identity for dimension d vectors, take 2(d+1) vector variables, called a_0 .. a_d, b_0 .. b_d. The identity is 0 = sum_pi (-1)^pi prod_i (a_i b_{pi(i)}), where the sum goes over all permutations pi of {0 .. d}, and (-1)^pi is the sign of that permutation. – b_jonas 20:17, 16 January 2012 (UTC)[reply]
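One way to see this general identity: sum_pi (-1)^pi prod_i (a_i b_{pi(i)}) is the Leibniz expansion of the determinant of the (d+1)×(d+1) matrix M with entries M[i][j] = a_i·b_j, and M factors as A·B^T with A and B of shape (d+1)×d, so M has rank at most d and its determinant vanishes. (Taking b_i = a_i appears to recover the Gram-type identity above.) A numpy sketch:

  import numpy as np

  d = 3
  rng = np.random.default_rng(1)
  A = rng.standard_normal((d + 1, d))  # rows are a_0 .. a_d
  B = rng.standard_normal((d + 1, d))  # rows are b_0 .. b_d
  M = A @ B.T                          # M[i, j] = a_i . b_j
  assert abs(np.linalg.det(M)) < 1e-9  # the permutation sum is det(M) = 0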
Sławomir Biały: are you sure T8 follows from the earlier axioms? We don't have fractions, so you can't just write a×a = (1/2 + 1/2)(a×a) = (a×a)/2 + (a×a)/2 = [from T7] (a×a - a×a)/2 = ((1-1)(a×a))/2 = 0/2 = 0. In fact, this might be a problem with this axiom system. Maybe we should add rational constants. – b_jonas 17:46, 14 January 2012 (UTC)[reply]
Yes, T8 follows from bilinearity and anticommutativity. (Bilinearity means real bilinear since we're on a real vector space, so yes there are fractions.) Sławomir Biały (talk) 21:31, 14 January 2012 (UTC)[reply]
I've changed my system to allow rational constants instead of integers. I think that doesn't break anything, and fixes at least this problem. – b_jonas 08:39, 15 January 2012 (UTC)[reply]
You need to allow homogeneity with respect to real scalars if you want to get anything intelligible. Otherwise R^3 is infinite-dimensional (as a vector space over Q), and there will be scalar products such that (pa)b ≠ p(ab) for real p. Similarly with the cross product. Sławomir Biały (talk) 12:43, 15 January 2012 (UTC)[reply]
I think axiom (T2) gives that. Similarly (T6) for the cross product. – b_jonas 15:09, 15 January 2012 (UTC)[reply]
Then I'm confused why you seem to think that real scalings are allowed in some places but not others. Sławomir Biały (talk) 17:40, 15 January 2012 (UTC)[reply]

final step in proof

Salut, salve, hola! (and hi, of course:) I am trying to derive the infinite series for e^x at x=1. I am starting from the definition e^x = lim_(n→∞) (1 + x/n)^n. I would like to expand the right-hand side by the binomial theorem / Pascal's triangle, but this is only valid if n is a natural number (unless I consider formal power series, and I do not plan to). If I expand the right-hand side assuming n approaches infinity through the naturals, by what argument can I say that the value of this expansion is the same as the original expression had I made no such assumption on n? I know it is right, of course, but how would I prove it? — Preceding unsigned comment added by 200.60.11.20 (talk) 19:47, 14 January 2012 (UTC)[reply]
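A numeric aside, not a proof: at x = 1 the limit definition and the series expansion agree to the expected accuracy. A quick Python check:

  from math import e, factorial

  n = 10**6
  print((1 + 1/n)**n)                            # limit definition, approaches e from below
  print(sum(1/factorial(k) for k in range(20)))  # partial sums of the series sum 1/k!
  print(e)                                       # math.e for comparison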

See our article on the binomial series. The binomial series is more than just a formal power series, it is a convergent series provided certain conditions are met. Fly by Night (talk) 20:47, 14 January 2012 (UTC)[reply]
You can show that the function f(x) = (1 + 1/x)^x is increasing and bounded above, so the limit as x → ∞ exists. Then you are allowed to compute the limit sequentially, by specifying integer values of x. Sławomir Biały (talk) 21:34, 14 January 2012 (UTC)[reply]

but why am I allowed to compute it sequentially? 200.60.11.20 (talk) —Preceding undated comment added 22:19, 14 January 2012 (UTC).[reply]

If lim_(x→∞) f(x) = L, then lim_(n→∞) f(n) = L also. Sławomir Biały (talk) 23:35, 14 January 2012 (UTC)[reply]
See Characterizations of the exponential function. Dmcq (talk) 23:48, 14 January 2012 (UTC)[reply]
Once you know the limit exists, all "approaches" must give the same value. You could let n run through only the primes, or only irrational values, if you like. GromXXVII (talk) 17:49, 15 January 2012 (UTC)[reply]
As said above, if a function converges to a limit, every sequence of values of the function has the same limit (formally, if lim_(x→∞) f(x) = L and x_n → ∞, then lim_(n→∞) f(x_n) = L. In particular, taking x_n = n gives lim_(n→∞) f(n) = lim_(x→∞) f(x), if the RHS exists). This important fact can be easily shown from the definitions.
But this is all kind of moot, because in the definition e^x = lim_(n→∞) (1 + x/n)^n, n is a natural number (you are working with a sequence in the first place), hence the letter n was used. -- Meni Rosenfeld (talk) 09:21, 16 January 2012 (UTC)[reply]