Definitions into mathematical objects 9

This is the ninth post in a series, continuing TDMO1TDMO2TDMO3TDMO4, TDMO5TDMO6, TDMO7 and TDMO8.  This series builds up to an explanation of the concept of form in the paper Graph-Based Logic and Sketches by Atish Bagchi and me.  This post tells how to sketch cartesian closed categories and gives an example of a form.

Cartesian closed categories

A Cartesian closed category is a category \mathcal{C} with the following structure:

  • \mathcal{C} has binary products.
  • For each pair of objects A and B of \mathcal{C}, there is an object B^A and an arrow \text{eval}:B^A\times A\to BB^A is an exponential object.
  • For each triple of objects A, B and C of \mathcal{C}, there is a map \lambda:\text{Hom}(B\times A, C)\to \text{Hom}(B,C^A) such that for every arrow f:B\times A\to C, the diagram below commutes:

CCCDiagrams1(lameval)

  • For any arrow g:B\to C^A, \lambda(\text{eval}\circ(g\times\text{Id}[A]))=g.

It follows that \lambda is a natural isomorphism of homsets.

Going from f:B\times A\to C to \lambda f is called currying or partial application. You can curry a multivariable function all the way down to a constant.  The inverse process is usually called uncurrying.   It constitutes getting rid of entries in the codomain of a function that are function spaces by adding variables to the function. I suggest using blanding instead of “uncurrying”.

The FL sketch for cartesian closed categories

The FL sketch for CCC’s (which I will call CCCSk) is spelled out in detail in GBLS. I will mention some of the constructions here.  I will use many of them in the next section to give an example of a form.

Objects and arrows

The arrows needed are spelled out in Sections 7.2 and  7.6 of GBLS.  Here I mention

  • \text{unit}:\text{ob}\to\text{ar} that formally picks out the identity arrow of an object.
  • \text{twovf}, the formal set of functions of two variables, in other words arrows of the form f:B\times A\to C.
  • \text{arrow}:\text{twovf}\to\text{ob}^f that picks out the arrow f itself.
  • \text{tsource}:\text{twovf}\to\text{ob} that picks out the source B\times A of a two-variable function f:B\times A\to C.
  • \text{curry}, the formal set of curried functions from B \to C^A.
  • \text{csource}:\text{curry}\to\text{ob}^B and \text{ctarget}:\text{curry}\to\text{ob}^{C^A} that pick out the source and target of a curried function g:B\to C^A.
  • \text{fs}:\text{ob}^B\times\text{ob}^A\to\text{ob}^{B^A} that picks out the function space B^A of two objects B and A.
  • \text{ar}_2, the formal set of composable pairs of arrows in the FL sketch for categories, defined by this pullback diagram:

CatAr2Def(Ar2Def)

For the purposes of the example below, I need  \text{bsource}:\text{twovf}\to\text{ob}^B\times\text{ob}^A as defined by this diagram:CCCTSourceDef(Bsource)

In an object-oriented program built on this work, this would be a derived method..

One of the limit cones in CCCSk defines \text{curry} and in particular gives \text{fs} the properties it needs:

CCCurryDiagram(Curry)

The commutative diagram below should have been included in the FL sketch for categories in Section 7.2 of GLBS.   It gives the composite of a composable pair the right source and target.

CatCompCompat(Compreq)

Representing reflexive function spaces

An object A of a cartesian closed category is reflexive if there are arrows I:A^A\to A and J:A\to A^A such that

ReflexiveFSRetractDiagram(ret)

commutes.  That means  A^A is a retract of AA^A is called  a reflexive function space. RFS’s  are the theoretical basis for programming languages which have “functions as first class objects”.

So a sketch for RFS’s would be defined by a limit cone over  the diagram below.  I have always shown the cones in blue before, but GBLS does not show cones at all, and it isn’t necessary to show them.

ReflexiveConeBase(ReflFSConeBase)

To specify a reflexive function space you would have to give  an object A and two arrows I:A^A\to A, J:A\to A^A that satisfies Diagram (lameval) above, so the mandatory projections would be to \text{ob}^A, \text{ar}^I and \text{ar}^J.  The projection to \text{ob}^{A^A} is induced by Diagram (Curry) above, the projection to \text{ar}^{<J,I>} by Diagram (Ar2Def), and the projection to \text{ob}^A\times \text{ob}^A follows from the fact that it is the vertex of the product cone shown.  The other projections exist by composition.

Let’s call the vertex of this cone \text{rfs} for “reflexive function space”.  Note that all the objects and arrows in this diagram exist in the FL sketch for cartesian closed categories.

Now you can check (or forget all the constructions above and just Put Your Hands On The Monitor And Believe) that in a model M of the FL sketch for CCC’s, M(\text{rfs}) would be the set of reflexive function spaces together with specified inclusion I:A^A\to A and retraction J:A\to A^A.

Non-surprise: In any model of the FL sketch for CCC’s in the category of sets, M(\text{rfs}) will be the empty set. It is well known that for any set A  the function space A^A has cardinality properly bigger that the cardinality of A, so there can be no inclusion of A^A into A.

Nevertheless, M(\text{rfs}) is a perfectly well defined set.  It just turns out to be empty.

Forms

Forms are defined formally in GBLS.  This series of posts is supposed to give you an intuitive idea or understanding of form.  Well, we can make a form for reflexive function spaces out of the work done above.  Some terminology:

Constructor space

The sketch CCCSk for cartesian closed categories is a constructor space. A constructor space must be an FL-sketch that “contains” and is “generated” (see GBLS, Chapter 6.1) by the FL sketch for categories, and indeed CCCSk is built from the FL sketch for categories by adjoining some objects and arrows, cones and commutative diagrams that all exist already in the FL Cattheory for categories (see TDMO6).  So a constructor space \text{C} has models we can call  C-categories. In particular a model for CCCSk in sets is a cartesian closed category.  In previous posts I have constructed constructor spaces for categories with finite products and categories with finite limits.

Constructor spaces construct doctrines, in the one-dimensional case and with restrictions I will mention in a later post.

For any constructor space \text{C} , take an object F in the FL Cattheory of \text{C}   and adjoin a new arrow f:1\to F where 1 is the terminal object.  f  is what we call a C-form and \text{C} with f as a freely adjoined arrow is called \text{C}_f.

A model of a \text{C}-form “in a C-category \cal{K}“  means that \cal{K} is a model of \text{C}_f  This means that in  \cal{K}, M(F) is nonempty.

The CCC-form for reflexive function spaces is then a global element of the node \text{rfs} of CCCSk as constructed above for some CCC.  There are no models in the category of sets, but there are many toposes (and many other CCC’s) that do have models, such as the realizability topos.

I have lots of miscellaneous comments and explanations to make about the idea of form which must wait for another post.

References

[1] Charles Wells (1990), A generalization of the concept of sketch.

(see TDMO6) (see TDMO6)

Three kinds of mathematical thinkers

This is a continuation of my post Syntactic and semantic thinkers, in which I mentioned Leone Burton’s book [1] but hadn’t read it yet.  Well, now it is due back at the library so I’d better post about it!

I recommend this book for anyone interested in knowing more about how mathematicians think about and learn math.  The book is based on in-depth interviews with seventy mathematicians.  (One in-depth interview is worth a thousand statistical studies.)   On page 53, she writes

At the outset of this study, I had two conjectures with respect to thinking style.  The first was that I would find the two different thinking styles,the visual and the analytic, well recorded in the literature… The second was that research mathematicians would move flexibly between the two.  Neither of these conjectures were confirmed.

What she discovered was three styles of mathematical thinking:

Style A: Visual (or thinking in pictures, often dynamic)

Style B: Analytic (or thinking symbolically, formalistically)

Style C: Conceptual (thinking in ideas, classifying)

Style B corresponds more or less with what was called “syntactic” in [3] (based on [2]).  Styles A and C are rather like the distinctions I made in [3] that I called “conceptual” and “visual”, although I really want Style A to communicate not only “visual” but “geometric”.

I recommend jumping through the book reading the quotes from the interviews.  You get a good picture of the three styles that way.

Visual vs. conceptual

I had thought about this distinction before and have had a hard time explaining what “conceptual” means, particularly since for me it has a visual component.  I mentioned this in [3].  I think about various structures and their relationship by imagining them as each in a different part of a visual field, with the connections as near as I can tell felt rather than seen.  I do not usually think in terms of the structures’ names (see [4]).  It is the position that helps me know what I am thinking about.

When it comes time to write up the work I am doing, I have to come up with names for things and find words to describe the relationships that I was feeling. (See remark (5) below).  Sometimes I have also written things down and come up with names, and if this happened very much I invariable get a clash of notation that didn’t bother me when I was thinking about the concepts because the notations referred to things in different places.

I would be curious if others do math this way.  Especially people better than I am.  (Clue to a reasonable research career:  Hang around people smarter than you.)

Remarks

1) I have written a lot about images and metaphors [5], [6].  They show up in the way I think about things sometimes.  For example, when I am chasing a diagram I am thinking of each successive arrow as doing something.  But I don’t have any sense that I depend a lot on metaphors.  What I depend on is my experience with thinking about the concept!

2) Some of the questions on Math Overflow are of the “how do I think about…” type (or “what is the motivation for…”).  Some of the answers have been Absolutely Entrancing.

3) Some of the respondents in [1] mentioned intuition, most of them saying that they thought of it as an important part of doing math.  I don’t think the book mentioned any correlation between these feelings and the Styles A, B, C, but then I didn’t read the book carefully.  I never read any book carefully. (My experience with Style B of the subtype Logic Rules diss intuition. But not analysts of the sort who estimate errors and so on.)

4) Concerning A, B, C:  I use Style C (conceptual) thinking mostly, but a good bit of Style (B) (analytic) as well.  I think geometrically when I do geometry problems, but my research has never tended in that direction.  Often the analytic part comes after most of the work has been done, when I have to turn the work into a genuine dry-bones proof.

5) As an example of how I have sometimes worked, I remember doing a paper about lifting group automorphisms (see [7]), in which I had a conceptual picture with a conceptual understanding of the calculations of doing one transformation after another which produced an exact sequence in cohomology.  When I wrote it up I thought it would be short.  But all the verifications made the paper much longer.  The paper was conceptually BigChunk BigChunk BigChunk BigChunk … but each BigChunk required a lot of Analytic work.  Even so, I missed a conceptual point (one of the groups involved was a stabilizer but I didn’t notice that.)

References

[1] Leone Burton, Mathematicians as Enquirers: Learning about Learning Mathematics.  Kluwer, 2004.

[2] Keith Weber, Keith Weber, How syntactic reasoners can develop understanding, evaluate conjectures, and generate counterexamples in advanced mathematics. Proof copy available from Science Direct.

[3] Post on this blog: Syntactic and semantic thinkers.

[4] Post: Thinking without words.

[5] Post: Proofs without dry bones.

[6] Abstractmath.org article on Images and Metaphors.

[7] Post: Automorphisms of group extensions updated.

Naive proofs

The monk problem

A monk starts at dawn at the bottom of a mountain and goes up a path to the top, arriving there at dusk. The next morning at dawn he begins to go down the path, arriving at dusk at the place he started from on the previous day. Prove that there is a time of day at which he is at the same place on the path on both days.

Proof: Envision both events occurring on the same day, with a monk starting at the top and another starting at the bottom at the same time and doing the same thing the monk did on different days. They are on the same path, so they must meet each other. The time at which they meet is the time required.

The pons asinorum

Theorem: If a triangle has two equal angles, then it has two equal sides.

Proof: In the figure below, assume angle ABC = angle ACB. Then triangle ABC is congruent to triangle ACB since the sides BC and CB are equal and the adjoining angles are equal.

PATriangle

I considered the monk problem at length in my post Proofs Without Dry Bones.  Proofs like the one given of the pons asinorum, particularly its involvement with labeling, recently came up on the mathedu mailing list.  See also my question on Math Overflow.

Naive proofs

These proofs share a characteristic property; I propose to say they are naive, in the sense Halmos used it in his title Naive Set Theory.

The monk problem proof is naive.

For the monk problem, you can give a model of a known mathematical type (for example model the paths as  smoothly parametrized curves on a surface) and use known theorems (for example the intermediate value theorem) and facts (for example that clock time is cyclical and invariant under the appropriate mapping) to prove it.  But the proof says nothing about that.

You could imagine inventing an original set of axioms for the monk problem, giving axioms for a structure that are satisfied by the monk’s journeys and their timing and that imply the result.  In principle, these could be very different from multivariable calculus ideas and still serve the purpose. (But I have not tried to come up with such a thing.)

But the proof as given simply uses directly  known facts about clock time and traveling on paths.  These are known to most people.  I have claimed in several places that this proof is still a mathematical proof.

Every proof is incomplete in the sense that they provide a mathematical model and analyze it using facts the reader is presumed to know.  Proofs never go all the way to foundations.  A naive proof simply depends more than usual on the reader’s knowledge: the percentage of explication is lower.  Perhaps “naive” should also include the connotation that the requisite knowledge is “common knowledge”.

The pons asinorum proof is naive.

This involves some subtle issues.  When I first wrote about this proof in the Handbook I envisioned the triangle as existing independently of any embedding in the plane, as if in the Platonic world of ideals.  I applied some labels and a relabeling and used a known theorem of Euclid’s geometry.  You certainly don’t have to know where the triangle is in order to understand the proof.

That’s a clue.  The triangle in the problem does not need to be planar. It is true for triangles in the sphere or on a saddle surface, because the proof does not involve the parallel axiom. But the connection with the absence of the parallel axiom is illusory.  When you imagine the triangle in your head the proof works directly for a triangle in any suitable geometry, by imagining the triangle as existing in and of itself, and not embedded in anything.

Questions

  1. How do you give a mathematical definition of a triangle so that it is independent of embedding?  This was the origin of my question on Math Overflow, although I muddled the issue by mentioning specific ways of doing it.
  2. (This is a variant of question 1.)  Is there anything like a classifying topos or space for a generic triangle?  In other words, a category or space or something that is just big enough to include the generic triangle and from which mappings to suitable spaces or categories produce what we usually mean by triangles.
  3. Some of the people on mathedu thought a triangle obviously had to have labels and others thought it obviously didn’t.  Specifically, is triangle ABC “the same” as triangle ACB?  Of course they are congruent.  Are they the sameThis is an evil question. The proof works on the generic isosceles triangle.  That’s enough.  Isn’t it?  All three corners of the generic isosceles triangle are different points.  Aren’t they?  (I have had second, third and nth thoughts about this point.)
  4. You can define a triangle as a list of lengths of edges and connectivity data.  But the generic triangle’s sides ought to be (images of) line segments, not abstract data.  I don’t really understand how to formulate this correctly.

Note

1.  I could avoid discussion of irrelevant side issues in the monk problem by referring to specific times of day for starting and stopping, instead of dawn and dusk.  But they really are irrelevant.

BC and AD

I forget half the time that to be politically correct I should use BCE and CE instead of BC and AD.  I have a better idea.  I decree that from now on, BC means Backward Counting and AD means Advancing Denumeration.  At least in what I write.  Unless I forget.

Characterizing triangles unembeddedly

I just posted this question on mathoverflow (I recommend looking into this new forum):

The mathedu mailing list has a recent longish thread which discusses among other things whether we should teach triangles as labeled or unlabeled to high school students (this is a vast oversimplification of the thread).  I have long been concerned with how we think (informally and formally) about mathematical objects, as for example my unfinished article here about the many ways of thinking about function.  So naturally, I started to consider how we think about triangles.

Consider circles.   Most informal and formal descriptions involve an embedding into R^2, but they *can* be characterized as manifolds (even as Riemannian manifolds) of dimension 1 with specific properties, independent of any embedding. This sort of thing has turned out to be a major way to think about all sorts of spaces.  So can we describe triangles in a similar way?

Unfortunately, manifolds are far removed from my usual mathematical work (category theory).  What I *think* I understand is that there can be *piecewise* linear manifolds, even Riemannian ones.  So perhaps we can say a triangle is a piecewise linear manifold of dimension 1 with certain properties.  Now, I want to define a triangle so that it comes complete with information about the lengths of its sides and what the three angles are.  Riemannian manifolds have a way to specify length and angles, and I can believe you can make the sides have specific lengths.  But the angles?  It seems to me that the tangent spaces (like those on a circle) result in all angles being 0 or pi, except at the corners where they don’t exist.  But I may not understand the situation correctly.

So my question is:  Is there a known methodology that allows triangles to be characterized independent of embeddings in such a way that incorporates information about side lengths and angles?

More about pronouncing “a” and “the”

I eavesdropped on my grandson dictating a story for his granny to type.  He always pronounced “a” as a schwa.  I mentioned in an earlier post that Barak Obama and Hilary Clinton both normally pronounce it “ay” in speeches and wondered if this is a generational change.  The grandsonian evidence suggests not.  But:

  1. Do Obama and Clinton pronounce it that way in ordinary conversation?  I bet not.
  2. Is there a Speech Making School for Politicians that has them do this, or is this the result of their unconsciously adopting a Speech Making Register?  I’ll bet the latter.

The grandson also regularly said “thee” for “the” before vowel sounds and used the schwa before consonants.  This makes me want to go back to You Tube and eavesdrop on politicians some more.

I might add that if he stopped after “the” to let poor old granny catch up with her typing, he used the schwa even thought what came next was a vowel.  However, this may not prove anything since he may not have had the next word in mind.

Years ago, I was fascinated listening to John Kennedy, who pronounced r at the end of a word only if the next word began with a vowel, as in “The far east” but “The fah boundaries of…”.  I thought that it was remarkable that he did this even if he paused before second word.

Definitions into mathematical objects 8

This is the eighth post in a series, continuing TDMO1TDMO2TDMO3TDMO4,TDMO5TDMO6 and TDMO7.  This series builds up to an explanation of the concept of form in the paper Graph-Based Logic and Sketches by Atish Bagchi and me.  This post tells how to sketch categories with finite products.

Categories with finite products

We gave an FL sketch for categories in TDMO7.  The sketch has (among others) objects the formal set (Note 2) of objects \text{ob}, the formal set of arrows \text{ar} and the formal set of composable pairs of arrows \text{ar}_2, as well as arrows

\mathbf{source}, \mathbf{target}:\mathbf{ar}\longrightarrow\mathbf{ob} that formally pick the source and target of an arrow and

\mathbf{comp}:\mathbf{ar_2}\longrightarrow\mathbf{ar} that picks out the composite of a pair of arrows.

A category has finite products (is an FP category) if it has terminal objects and binary products.  GBLS describes the construction of the FL sketch for categories with finite products in section 7.3, which spells out how you sketch the set of terminal objects and the set of binary product cones.  I will give a more detailed discussion of how to do binary product cones here and introduce the annotation notation we use.

Syntax and semantics

We must distinguish between the FL sketch for FP categories and its models, which are FP categories.  A diagram in the FL sketch will be in black (the color for syntax) and a diagram in a category with finite products will be green (the color for semantics).  In both syntax and semantics, projection arrows may be blue and arrows that exist uniquely (fill-in arrows) may be red.

Product diagrams

A product diagram in an FP category looks like this:

ProductCone(Diagram \Lambda)

But a diagram can look like that without being a product diagram. You have to say that it is a product diagram.  (Note [1].)  This means that for any cone to the same pair of objects

AnyCone

(Diagram \Gamma)

there is a unique arrow h making this diagram commute:

FiaDiagram

(Diagram F)

FL sketch for FP categories

To get the FL sketch for FP categories, you start with the sketch for categories and add some objects and arrows.  The ones listed below are the objects and arrows needed for sketching the set of binary cones and finite-product cones.  Others are needed for the terminal object (much easier).

Objects

\text{cone}, the formal set of cones of the form \Lambda.

\text{fid}, the formal set of fill-in diagrams (diagrams of the form of F above).

Arrows

\text{prod}:\text{ob}\times\text{ob}\rightarrow\text{cone}, that picks out the product cone over a pair of objects.

\text{soco}:\text{fid}\rightarrow\text{cone}, that picks out the source cone of a fill-in arrow.

\text{taco}:\text{fid}\rightarrow\text{cone}, that picks out the target cone of a fill-in arrow.

\text{ufid}:\text{cone}\rightarrow\text{fid}, that takes a cone to the unique fill-in diagram that has the cone as source cone.

\text{fia}:\text{fid}\rightarrow\text{ar} that formally picks out the fill-in arrow in a fill-in diagram.

Specification for the formal set of cones

The FL sketch for FP categories contains this limit cone, which requires that the two arrows in a cone have the same source. Note that I have named the projections lproj and rproj.

ConeDef

(Diagram ConeSpec)

Annotations

This cone is annotated according to a system that is spelled out precisely in GBLS.  The annotation of this diagram refers to cone \Gamma above.  Some perspectives on this situation:

  1. The annotations allow you to chase the diagram.  For example, if \Gamma is a member of the set of cones of a model, then \text{lproj}(\Gamma)=q_1 and \text{source}(q_1)=\text{V}.
  2. \Gamma is being used as the shape graph (GBLS, 2.3) for ConeSpec.
  3. The annotation is a section of the model thought of as a sheaf.

Specification for the formal set of fill-in diagrams

This much more complicated annotated limit cone defines \text{fid}:

fidcone

I completed the annotation compared to the version in GBLS, whose partial annotation generates the rest.  The three projection arrows shown generate all the others, not shown to avoid clutter.

Some notes and examples:

  1. It is worthwhile checking through the conditions to see that this limit cone really does specify fid.  If you do, you will see the sketch needs a couple of commutative diagrams, found in GBLS, 7.3.2.
  2. The whole diagram is commutative because of the diagrams just mentioned and the commutative diagrams of the FL sketch for categories mentioned in TDMO7 (and described completely in GBLS 7.2).
  3. This diagram is symmetrical in spirit, with one exception: The arrow prod breaks the symmetry because there is no corresponding arrow going upward.  This reflects the fact that a fill-in diagram is not symmetrical: The right hand cone is a limit cone but the left hand cone is not.
  4. I suspect that the diagram could be drawn in a way that is really symmetrical (geometrically) in three dimensions (or four?), except for the break mentioned in (3).

What’s next

We will look more briefly at the FL sketch for cartesian categories, and then it is time, finally, to define form.

Notes

1. When students start taking college math, they soon discover that they have to read the surrounding text to understand what a symbolic expression means.  The expression is no longer self-sufficient.  (The \times symbol can mean multiplication of numbers, of matrices, or cross product of vectors.) When I first came across this aspect of mathematics in a matrix theory course at Texas Southmost College, I felt that I had been ejected from paradise.

2. It is bothersome to refer to the formal sets of objects and of arrows, because the FL sketch for categories, as well as the FL sketch for FP categories I discuss in this post, has models in many other FL categories.  \text{ob} could be called the formal object of objects, and so on.  In fact I could call it the formal object of objects, since green is for semantics. I expect to combine these posts into an expository paper eventually and perhaps I will carry out that practice there. However, I foresee complications which I must contemplate first. (Compare the device I suggested in this post on the comma rule).

References

  1. Atish Bagchi and Charles Wells, Graph-Based Logic and Sketches, draft, September 2008, on ArXiv.
  2. Michael Barr, Models of sketches (1986). Cahiers de Topologie et Géométrie Différentielle Catégorique, 27:93-107.
  3. Michael Barr and Charles Wells. On the limitations of sketches (1992). Canadian Mathematical Bulletin, 35:287-294.
  4. Michael Barr and Charles Wells, Category Theory for Computing Science (1999). Les Publications CRM, Montreal (publication PM023).
  5. Michael Barr and Charles Wells, Toposes, Triples and Theories (2005). Reprints in Theory and Applications of Categories 1.
  6. Peter T. Johnstone, Sketches of an Elephant: A Topos Theory Compendium, Volume 2 (2002). Oxford Logic Guides 44, Oxford University Press, ISBN 978-0198524960.
  7. Yoshiki Kinoshita, John Power, and Makoto Takeyama. Sketches (1997). In Mathematical Foundations of Programming Semantics, Thirteenth Annual Conference, Stephen Brookes and Michael W. Mislove, editors. Elsevier.
  8. A.J. Power and Charles Wells. A formalism for the speci fication of essentially algebraic structures in 2-categories (1992) Mathematical Structures in Computer Science,2:1-28.
  9. Charles Wells. A generalization of the concept of sketch (1990). Theoretical Computer Science, 70:159-178.

Syntactic and semantic thinkers

A paper by Keith Weber

Reidar Mosvold’s math-ed blog recently provided a link to an article by Keith Weber (Reference [2]) about a very good university math student he referred to as a “syntactic reasoner”.  He interviewed the student in depth as the student worked on some proofs suitable to his level.  The student would “write the proofs out in quantifiers” and reason based on previous steps of the proof in a syntactic way rather than than depending on an intuitive understanding of the problem, as many of us do (the author calls us semantic reasoners).  The student didn’t think about specific examples –  he always tried to make them as abstract as possible while letting them remain examples (or counterexamples).

I recommend this paper if you are at all interested in math education at the university math major level — it is fascinating.  It made all sorts of connections for me with other ideas about how we think about math that I have thought about for years and which appear in the Understanding Math part of abstractmath.org.  It also raises lots of new (to me) questions.

Weber’s paper talks mostly about how the student comes up with a proof.  I suspect that the distinction between syntactic reasoners and semantic reasoners can be seen in other aspects of mathematical behavior, too, in trying to understand and explain math concepts.  Some thoughts:

Other behaviors of syntactic reasoners (maybe)

1) Many mathematicians (and good math students) explain math using conceptual and geometric images and metaphors, as described in Images and metaphors in abstractmath.org.   Some people I think of as syntactic reasoners seem to avoid such things. Some of them even deny thinking in images and metaphors, as I discussed in the post Thinking without words.   It used to be that even semantic reasoners were embarassed to used images and metaphors when lecturing (see the post How “math is logic” ruined math for a generation).

2) In my experience, syntactic reasoners like to use first order symbolic notation, for example eq0001MP

and will often translate a complicated sentence in ordinary mathematical English into this notation so they can understand it better.  (Weber describes the student he interviewed as doing this.)  Furthermore they seem to think that putting a formula such as the one above on the board says it all, so they don’t need to draw pictures, wave their hands [Note 1], and so on.  When you come up with a picture of a concept or theorem that you claim explains it their first impulse is to say it out in words that generally can be translated very easily into first order symbolism, and say that is what is going on.  It is a matter of what is primary.

The semantic reasoners of students and (I think) many mathematicians find the symbolic notation difficult to parse and would rather have it written out in English.  I am pretty good at reading such symbolic notation [Note 2] but I still prefer ordinary English.

3) I suspect the syntactic reasoners also prefer to read proofs step by step, as I described in my post Grasshoppers and linear proofs, rather than skipping around like a grasshopper.

And maybe not

Now it may very well be that syntactic thinkers do not all do all those things I mentioned in (1)-(3).  Perhaps the group is not cohesive in all those ways.  Probably really good mathematicians use both techniques, although Weyl didn’t think so (quoted in Weber’s paper).   I think of myself as an image and metaphor person but I do use syntax, and sometimes even find that a certain syntactic explanation feels like a genuinely useful insight, as in the example I discussed under conceptual in the Handbook.

Distinctions among semantic thinkers

Semantic thinkers differ among themselves.  One demarcation line is between those who use a lot of visual thinking and those who use conceptual thinking which is not necessarily visual.  I have known grad students who couldn’t understand how I could do group theory (that was in a Former Life, before category theory) because how could you “see” what was happening?  But the way I think about groups is certainly conceptual, not syntactic.  When I think of a group acting on a space I think of it as stirring the space around.  But the stirring is something I feel more than I see.  On the other hand, when I am thinking about the relationships between certain abstract objects, I “see” the different objects in different parts of an interior visual space.  For example, group is on the right, stirring the space-acted-upon on the left, or the group is in one place, a subgroup is in another place while simultaneously being inside the group, and the cosets are grouped (sorry) together in a third place, being (guess what) stirred around by the group acting by conjugation (Note [3]).

This distinction between conceptual and visual, perhaps I should say visual-conceptual and non-visual-conceptual, both opposed to linguistic or syntactic reasoning, may or may not be as fundamental as syntactic vs semantic.   But it feels fundamental to me.

Weber’s paper mentions an intriguing sounding book (Reference [1]) by Burton which describes a three-way distinction called conceptual, visual and symbolic, that sounds like it might be the distinction I am discussing here.  I have asked for it on ILL.

Notes

  1. Handwaving is now called kinesthetic communication.  Just to keep you au courant.
  2. I took Joe Shoenfield’s course in logic when his book  Mathematical Logic [3] was still purple.
  3. Clockwise for left action, counterclockwise for right action.  Not.

References

  1. Leone L. Burton, Mathematicians as Enquirers: Learning about Learning Mathematics.  Springer, 2004.
  2. Keith Weber, How syntactic reasoners can develop understanding, evaluate conjectures, and generate counterexamples in advanced mathematics. Proof copy available from Science Direct.
  3. Joseph Shoenfield, Mathematical logic, Addison-Wesley 1967, reprinted 2001 by the Association for Symbolic Logic.

Definitions into mathematical objects 7

This is the seventh post in a series, continuing TDMO1, TDMO2, TDMO3, TDMO4, TDMO5 and TDMO6.  This series builds up to an explanation of the concept of form in the paper Graph-Based Logic and Sketches by Atish Bagchi and me.  This post discusses the sketch for categories.

The sketch for categories

FL sketches (finite-limit sketches) makes it possible to construct a sketch for categories. To do this you need some nodes:

\mathbf{ob}, which in a model will become the set of objects –  so in our “formal” terminology  is the formal set of objects.  (This terminology, by the way, is due to John W. Gray).

\mathbf{ar}, the formal set of arrows.

\mathbf{ar}_2, the formal set of composable pairs of arrows.

\mathbf{ar}_3, the formal set of composable triples of arrows, needed to state the associative law.

It needs many arrows, some of which are:

\mathbf{unit}:\mathbf{ob}\longrightarrow\mathbf{ar} that formally picks the identity arrow of an object

\mathbf{source}, \mathbf{target}:\mathbf{ar}\longrightarrow\mathbf{ob} that formally pick the source and target of an arrow.

\mathbf{comp}:\mathbf{ar_2}\longrightarrow\mathbf{ar} that picks out the composite of a pair of arrows.

…as well as many structural arrows that, for example, pick out the first arrow in a composable chain, or that picks out the pair of arrows <h, g\circ f> given a composable triple.

Cones and diagrams

Of course, you can’t just say that \mathbf{ar}_2 is the formal set of composable pairs of arrows.   What you must do is produce a cone that forces it to be that formal set.  And here it is:

CompositeCone

(The blue arrows are the projections, the black objects and arrows form the base diagram.  Note that the middle blue arrow is superfluous, as mentioned in TDMO5.  If you drop it, you may recognize this as an ordinary pullback diagram.)

You need an analogous cone for M(\mathbf{ar_3}) and diagrams for associativity and to make the identity arrows behave right.  The details are in GLBS (reference [1]), chapter 7.

In a model M, the elements of M(\mathbf{ar_2}) are diagrams that look like this:

CompositeChain2

This green diagram is in the model (semantics), and the cone above is a diagram in the syntax. You have to make the distinction constantly in this subject. This is remarkably annoying. To help, I am systematically putting objects of the category-that-is-the-model in green, and cone projections in blueBlack and blue means syntax,  green means semantics.

Remarks

M(\mathbf{ar_2}) is the domain of the composition arrow comp.  So defining categories requires more subtlety that defining an ordinary algebra, where the domains of the ops are always cartesian products:  this domain is an equalizer.  To sketch categories requires the full power of finite limits, not just finite products. Of course, that last sentence does not follow from anything I have said, since there might be a rascally way to use only finite products.  But there isn’t: the category of small categories is not regular (Exercise 6, page 278, of [3])  but any category of multisorted universal algebras (which are the same thing as models of FP theories) does have that property: see [2], section 8.4.

Nevertheless, the category of models of an FL sketch always has all sortwise (see note 1) limits and all sortwise filtered colimits. in particular initial algebras.  We will use that fact.

Categories with structure

Most categories with particular properties can also be sketched with FL sketches.  These include

  • Categories with finite limits
  • Cartesian closed categories
  • Toposes
  • Symmetric monoidal categories
  • Abelian categories

That’s for the next post.

Note

1.  Constructing a particular kind of limit or colimit  “sortwise” means that it is constructed sort by sort.  For example, if always M(S)\times M(T) = M(S\times T), then the category of models has sortwise products.  In most of the references the name “pointwise” is used.

References

  1. Atish Bagchi and Charles Wells, Graph-Based Logic and Sketches, draft, September 2008, on ArXiv.
  2. Michael Barr and Charles Wells, Toposes, Triples and Theories (2005).  Reprints in Theory and Applications of Categories 1.
  3. Michael Barr and Charles Wells, Category Theory for Computing Science (1999).  Les Publications CRM, Montreal (publication PM023).
  4. Michael Barr, Models of sketches (1986).  Cahiers de Topologie et Géométrie Différentielle Catégorique, 27:93-107.

Definitions into mathematical objects 6

This is the sixth post in a series, continuing TDMO1, TDMO2, TDMO3, TDMO4 and TDMO5.  This series builds up to an explanation of the concept of form in the paper Graph-Based Logic and Sketches by Atish Bagchi and me.  This post discusses the cattheories of finite-product sketches and finite-limit sketches (FP sketches and FL sketches).

Background

An FL sketch (finite-limit sketch) is a finite graph\mathcal{G} with a finite set \mathcal{D} of finite formally commutative diagrams and a finite set \mathcal{C} of formal limit cones over finite diagrams (which are not usually among the formally commutative diagrams (Note 1).)

A model of an FL sketch in an FL categoryC is a digraph map into the (underlying graph of) the FL category that take the diagrams in\mathcal{D} to commutative cones and the cones in\mathcal{C}  to limit cones.

The category of models of the FL sketch \mathcal{S} in an FL category \mathcal{C} is denoted by \mathbf{Mod}\left( \mathcal{S},\mathcal{C} \right).

FL and FP cattheories

The FL cattheory of a linear sketch\mathcal{S} is a category\text{CatTh}(\text{FL}, \mathcal{S}) together with a  model\text{G}:\mathcal{S}\to\text{CatTh}(\text{FL}, \mathcal{S}) with the the property that for any model M of \mathcal{S} in a category C, there is a unique model M' of \text{CatTh}(\text{FL}, \mathcal{S}) in C such that

GM.

commutes.  G is the generic model of \mathcal{S}.

The category of models of the cattheory \text{CatTh}(\text{FL},\mathcal{S}) in \mathcal{C} is denoted by \mathbf{Mod}\left( \text{CatTh}(\text{FL},\mathcal{S}),\mathcal{C} \right).

The cattheory has this universal property:

Theorem The map M\mapsto GM:\mathbf{Mod}\left( \mathcal{S},\mathcal{C} \right)\to \mathbf{Mod}\left( \text{CatTh}(\text{FL},\mathcal{S}),\mathcal{C} \right)

is an equivalence of categories.

(Compare the corresponding theorem in TDMO3 for linear categories.)

This theorem forces the cattheory to be determined up to natural equivalence of categories that commutes with the generic model.  For all practical purposes, a model of the sketch is thus the same thing as a model of its cattheory.

An FP sketch\mathcal{S} and its FP cattheory\text{CatTh}(\text{FP}, \mathcal{S}) are defined in the same way, except that the finite diagrams that the cones are over have to be discrete, and it has properties analogous to those for FL sketches given above.  Note that any FP sketch or  linear sketch will also have an FL cattheory.

Constructing FL cattheories

The linear cattheory of a linear sketch is just the free category generated by the graph of the sketch, with the formally commutative diagrams forced to be commutative.  It is more complicated to prove the existence of the FL cattheory of an FL sketch.  Once you do show it exists, it follows easily from the definition that it is uniquely determined up to equivalence of categories.

There are two approaches to showing the existence of the FL cattheory.

As a fixed point

Introduce an operator on categories that adjoins diagrams that in effect add limit cones to finite diagrams that don’t already have limits.  This is done in lots of detail in sections 4.2 and 4.3 of GLBS (Reference [1]).  Starting with an FL-sketch\mathcal{S}, the cattheory\text{CatTh}(\text{FL}, \mathcal{S}) is the minimum fixpoint of this operator.

This is the computer-sciencey way of doing it.  Each arrow and commutative diagram in the theory is constructed explicitly. When GLBS constructs proofs, this step-by-step construction corresponds to the inductive construction of formulas and rules of inference in classical string-based logic.

Embedded in a functor category

Given an FL sketch \mathcal{S}, the category\mathbf{Mod}\left( \mathcal{S},\text{Set} \right) of models of \mathcal{S} in the category of sets turns out to be a full reflective subcategory of the presheaf category \text{Set}^\mathcal{S}.

Let’s spell out what this means: the models are finite-limit-preserving functors and the presheaf category contains all functors.  The reflectivity means that the embedding has a left adjoint, which implies that a limit of a diagram in the model category is also the limit of the diagram in the presheaf category.  The fullness means that every natural transformation between models is a morphism of models.

Using these facts you can get an embedding of \mathcal{S}^{op} in \text{Set}^{\mathcal{S}}; \text{CatTh}(\text{FL}, \mathcal{S}) is then the full FL subcategory generated by the image of the embedding.

This is worked out in detail for the FP case in [3], Chapter 4.3.  The construction is then carefully described for the FL case  in Chapter 4.4, but the proof is omitted.  It is quite analogous to the FP case.

I have been focusing here on the FL case because that is the foundation of the construction of forms in GBLS.

Notes

1.  It is a nice exercise to show that you can eliminate all the commutative subdiagrams from the base diagram of a limit cone and get the same limit.

References

  1. Atish Bagchi and Charles Wells, Graph-Based Logic and Sketches, draft, September 2008, on ArXiv.
  2. Michael Barr and Charles Wells, Category Theory for Computing Science (1999).  Les Publications CRM, Montreal (publication PM023).
  3. Michael Barr and Charles Wells, Toposes, Triples and Theories (2005).  Reprints in Theory and Applications of Categories 1.

Goodnight, Irene

Look at this list:

Antigone
Aphrodite
Chloe
Hermione
Irene
Kalliope
Nike
Penelope
Phoebe
Zoe

All these are originally Greek names of supernatural beings (except Antigone?). The e is a feminine ending. Most of them are used now as women’s names. When Americans pronounce these names, with one exception they usually pronounce the final e.

The exception is “Irene”. I have heard British people say “I-reenie” but never an American. Is this because of “Good Night Irene”?

At one point when I was maybe eleven years old I bought a 45 of the Weavers singing Good Night Irene. It was my favorite song. The record had Tsena Tsena on the other side. I fell in love with Tsena Tsena which I had never heard before, but I still liked GNI too. For some time after that I looked for other records by the Weavers but I never saw one. Perhaps that was about the time the McCarthyites blacklisted them?

I was also attracted by the harmonies of some pieces by Bach. Now I think that the thing TT and Bach (and others of my favorite music, like some Procol Harum) have in common is the existence of both major and minor chords in the same piece. But when I asked my music teacher what was so wonderful about Bach she said she had never understood Bach.

Oh well.

Addenda to the 1993 Sketches paper

I have uploaded here a version of my 1993 sketches paper with an addendum listing a few relevant papers written since then.  I have not kept up with the field well enough to contemplate a complete revision of the 1993 paper.

I recommend that more people update their papers this way.  I did it by making a new PDF file with the added references and then using Acrobat to combine it with the old paper into one file.  That way I didn’t have to re-TeX the old paper, which is a good thing, since I don’t know where some of the .sty files are.

Mastering a proof

In response to Grasshoppers and linear proofs, Avery Andrews said:

Maybe a related question is how much time people do/ought spend on really mastering the proofs of theorems in textbooks, ‘mastering’ being, say, able to explain it in any desired amount of detail at least 2 weeks after last looking at it.

There are two different goals:

  1. Mastering the proof of a theorem in a textbook so that you can explain it in any desired amount of detail…
  2. Mastering a proof of the theorem so that you can explain it in any desired amount of detail…

My observation is that most research mathematicians don’t attempt (1); they are satisfied with (2).  Trying to understand a written proof in detail can be quite difficult:

  • The author may use misleading language.
  • The author may jump over a piece of reasoning that to them is obvious but not to you.
  • The author may mention a previous step or a theorem that justifies the current step, but get the reference wrong.

And so on.

In my observation the typical mathematician will look at the proof, perhaps getting some idea of the overall strategy of the whole proof or a particular part, and then think about it independently until they come up with a proof or part of it.  This may or may not be what the author had in mind.  But by thinking through it the reader will solidify their understanding of the proof in a way that reading and rereading step by step is unlikely to do.

When you construct your knowledge like that you are likely to have it in a permanent, well semi-permanent, way.

Steven Brust commits a zeugma

“…she immediately spurred her horse, yet the horse had hardly moved when Wadre moved his arm in an indication that she was not to advance beyond him, wherefore she drew rein, her sword, and the conclusion that the time was not yet quite at hand to charge.”

Steven Brust, The paths of the dead, p. 335.   New York: Tom Doherty Associates (2002), p. 335.

Composites of functions

In my post on automatic spelling reform, I mentioned the various attempts at spelling reform that have resulted in both the old and new systems being used, which only makes things worse.  This happens in Christian denominations, too.  Someone (Martin Luther, John Wesley) tries to reform things; result: two denominations.   But a lot of the time the reform effort simply disappears.  The Chicago Tribune tried for years to get us to write “thru” and “tho” –  and failed.  Nynorsk (really a language reform rather than a spelling reform) is down to 18% of the population and the result of allowing Nynorsk forms to be used in the standard language have mostly been nil.  (See Note 1.)

In my early years as a mathematician I wrote a bunch of papers writing functions on the right (including the one mentioned in the last post).  I was inspired by some algebraists and particularly by Beck’s Thesis (available online via TAC), which I thought was exceptionally well-written.  This makes function composition read left to right and makes the pronunciation of commutative diagrams get along with notation, so when you see the diagram below you naturally write h = fg instead of h = gf. Composite

Sadly, I gave all that up before 1980 (I just looked at some of my old papers to check).  People kept complaining.  I even completely rewrote one long paper (Reference [3]) changing from right hand to left hand (just like Samoa).  I did this in Zürich when I had the gout, and I was happy to do it because it was very complicated and I had a chance to check for errors.

Well, I adapted.  I have learned to read the arrows backward (g then f in the diagram above).  Some French category theorists write the diagram backward, thus:

CompositeOp

But I was co-authoring books on category theory in those days and didn’t think people would accept it. Not to mention Mike Barr (not that he is not a people, oh, never mind).

Nevertheless, we should have gone the other way.  We should have adopted the Dvorak keyboard and Betamax, too.

Notes

[1] A lifelong Norwegian friend of ours said that when her children say “boka” instead of “boken” it sound like hillbilly talk does to Americans.  I kind of regretted this, since I grew up in north Georgia and have been a kind of hillbilly-wannabe (mostly because of the music); I don’t share that negative reaction to hillbillies.  On the other hand, you can fageddabout “ho” for “hun”.

References

[1] Charles Wells, Automorphisms of group extensions, Trans. Amer. Math. Soc, 155 (1970), 189-194.

[2] John Martino and Stewart Priddy, Group extensions and automorphism group rings. Homology, Homotopy and Applications 5 (2003), 53-70.

[3] Charles Wells, Wreath product decomposition of categories 1, Acta Sci. Math. Szeged 52 (1988), 307 – 319.

“Automorphisms of group extensions” augmented

There has recently been an uptick in citations to my paper [1].  Several works over the years ([2], [3], [4]) have given proofs of my theorem that are easier to understand and more informative, so I have posted a package here that contains the original paper, a correction I published later, and the references below.  Malfait’s article in particular embeds my exact sequence into a remarkable cube of exact sequences.

[1] Charles Wells, Automorphisms of group extensions, Trans. Amer. Math. Soc, 155 (1970), 189-194.

[2] Kung Wei Yang, Isomorphisms of group extensions.  Pacific J. Math. Volume 50, Number 1 (1974), 299-304.

[3] D.J.S. Robinson, Applications of cohomology to the theory of groups, Groups – St. Andrews 1981, London Math. Soc. Lecture Notes vol. 71 (1982), pp. 46–80.

[4] Wim Malfait, The (outer) automorphism group of a group extension.   Bull. Belg. Math Soc. 9 (2002), 361-372.

Grasshoppers and linear proofs

Below, I give an detailed example of how the context of a proof changes as you read the proof line by line. This example comes from the abstractmath article on context.  I mean something like verbal context or  context in the computer science sense (see also Reference [1]): the values of all the relevant variables as specified up to the current statement in the proof.  For example, if the proof says “Suppose x = 3″, then when you read succeeding statements you know that x has the value 3, as long as it is not changed in some later statement.

Here is the text I will analyze:

Definition: Divides

Let m and n be integers with m\ne 0. The statement “m divides n” means that there is an integer q for which n=qm.

Theorem

Let m, n and p be integers, with m and n nonzero, and suppose m divides n and n divides p .  Then m divides p.

Proof

By definition of divides, there are integers q and q’ for which n=qm and p=q'n. We must prove that there is an integer q'' for which p=q''n. But p=q'n=q'qm, so let q''=q'q.  Then p=q''n.

0) Definition: Divides Changes the status of the word “divides” so that it becomes the definiendum. The scope is the following paragraph.
1) Let m and n be integers m and n are new symbols in this discourse, constrained to be integers
2) with m\neq 0 another constraint on m
3) The statement “m divides n” means that This sentence fragment gives the rest of the sentence (in the box below it) a special status.
4) there is an integer q for which n = qm. This clause introduces q, another new symbol constrained to be an integer.  The clause imposes a restraint on m, n and q, that they satisfy the equation n = qm. But we know this only in the scope of the word Definition, which ends at the end of the sentence.  Once we read the word Theorem we no longer know that q exists, much less that it satisfies the constraint.  Indeed, the statement of the definition means that one way to prove the theorem is to find an integer q for which n = qm. This is not stated explicitly, and indeed the reader would be wrong to draw the conclusion that in what follows the theorem will be proved in this way. (In fact it will in this example, but the author could have done some other kind of proof. )
5) Theorem The placement of the word “Theorem” here announces that the next paragraph is a mathematical statement and that the statement has been proved.  In real time the statement was proved long before this discourse was written, but in terms of reading the text in order, it has not yet been proved.
6) Let m, n and p be integers, We are starting a new context, in which we know that m, n and p are all  integers.  This changes that status of m and n, which were variables used in the preceding paragraph, but now all previous constraints are discarded. We are starting over with m, n, and p.  We are also starting what the reader must recognize as the hypotheses of a conditional sentence, since that affects the context in a very precise way.
7) with m and n nonzero. Now m and n are nonzero.  Note that in the previous paragraph n was not constrained to be nonzero.  Between the words “Let” and “with” in the current sentence, neither were constrained to be nonzero.
8 ) and n divides p More new constraints:  m divides n and n divides p.
9) Then m divides p.   The word “then” signals that we are starting the conclusion of the conditional sentence.  It makes a claim that m divides p whenever the conditions in the hypothesis are true.  Because it is the conclusion, it has a different status from the assumptions that m divides n and n divides p.   We can’t treat m as if it divides p even though this sentence says it does.  All we know is that the author is claiming that m divides p if the hypotheses are true, and we expect (because the next word is “Proof”) that this claim will shortly be proved.
10) Proof

This starts a new paragraph.  It does not necessarily wipe out the context.  If the proof is going to be by the direct method (assume hypothesis, prove conclusion) — as it does — then it will still be true that m and n are nonzero integers,  m divides n and n divides p.
11) By definition of divides, there are integers q and qfor which n = qm and p = q’n .

Since this proof starts by stating the hypothesis of the definition of “divides”, we now know that we are using the direct method, and that q and q’ are new symbols that we are to assume satisfy the equations  n = qm and p = q’n.   The phrase “by definition of divides” tells us (because the definition was given previously) that there are such integers, so in effect this sentence chooses q and qso that  n = qm and p = q’n.  The reader probably knows that there is only one choice for each of q and q′ but in fact that claim is not being made here.  Note that m, n and p are not new symbols – they still fall within the scope of the previous paragraph, so we still know that  m divides n and n divides p. If the proof were by contradiction, we would not know that.
12) We must prove that there is an integer  q” for which p = q”n q’’ is introduced by this sentence and is constrained by the equation. The scope of this sentence is just this sentence. The existence of  q’’ and the constraint on it do not exist in the context after the sentence is finished.  However, the constraints previously imposed on m, n, p, q and q’ do continue.
13) But  p = q’n = q’qm This is a claim about p, q, q′, m and n.  The equations are justified by certain preceding sentences but this justification is not made explicit.
14) so let q” = q’q We are establishing a new variable q″ in the context.   Now we put another constraint on it, namely q” = q’q.  It is significant that a variable named q″ was introduced once before, in the reference to the definition of divides.  A convention of mathematical discourse tells you to expect the author to establish that it fits the requirement of the definition. This condition is triggered by using the same symbol q″ both here and in the definition.
15) Then p = q”n This is an assertion about p, q″ and n, justified (but not explicitly) by the claim that p = q’n = q’qm.
16) The proof is now complete, although no statement asserts that.

I have several comments to make about this kind of analysis that are (mostly) not included in the abstractmath article.

a) This is supposed to be what goes through an experienced mathematician’s head while they are reading the proof.  Mostly subconsciously.  Linguists (as in Reference [1]) seem to think something like this takes place in your mind when you read any text, but it gets much denser in mathematical text.  Computer scientists analyze the operation of subprograms in this way, too.

b) Comment (a) is probably off the mark.  With a short proof like that, I get a global picture of the proof as my eyes dart back and forth over the various statements in the proof.  Now, I am a grasshopper: I read math stuff by jumping back and forth trying to understand the structure of the argument.  I do this both locally in a short proof and also globally when reading a long article or book:  I page through to find the topic I want and then jump back and forth finding the meanings of words and phrases I don’t understand.

c) I think most mathematicians are either grasshoppers or they are not good readers and they simply do not learn math by reading text.  I would like feedback on this.

d) If (a) is incorrect, should I omit this example from abstractmath?  I don’t think so.  My experience in teaching tells me that

  1. some students think this is perfectly obvious and why would I spend time constructing the example?,
  2. others are not aware that this is going on in their head and they are amazed to realize that it is really happening,
  3. and still others do not understand how to read proofs and when you tell them this sort of thing goes on in your head they are terminally intimidated.  (“Terminally” in the sense that they dye their hair black and become sociology majors.  They really do.)  Is that bad?  Well, I don’t think so.  I would like to hear arguments on the other side.

e) Can you figure out why item 8 of the analysis is labeled as “8 )” instead of “8)”?

Time is running out. I have other comments to make which must wait for a later post.

References

G. Chierchia and S. McConnell-Ginet (1990), Meaning and Grammar. The MIT Press.

Definitions into mathematical objects 5

This is the fifth post in a series, continuing TDMO1, TDMO2, TDMO3 and TDMO4.  This series builds up to an explanation of the concept of form in the paper Graph-Based Logic and Sketches by Atish Bagchi and me.

Note So far I have described linear sketches and FP-sketches, and I described the cattheory of a linear sketch.  FP-sketches have cattheories, too.  They will be described together with the cattheories of FL-sketches (defined below) in a later post.

Finite Limits

A cone to a set \cal{S} of objects of a category consists of an object v of the category and one arrow from v to each object in \cal{S}.  A finite-product diagram is a cone to such a (finite) set with the unique fill-in property I explained in TDMO4.

Now suppose we have a finite diagram \cal{D} in a category.  I am specifically not assuming it is commutative.  A commutative cone to \cal{D} is an object v, an arrow(called a projection) from v to each node of \cal{D}, with the additional property that each triangle formed by two projections and an arrow of the diagram must commute.  The diagram is called the base diagram of the cone.

Here is an example base:PullbackBase(a)

A commutative cone over this base will look like the left diagram below.Pullbacks

Since the triangles involving two projections have to commute, the diagonal project is determined by the other two.  For this reason, this particular example is almost always drawn as on the right.

Two other examples:  In each case all the projections are shown on the left and only the necessary ones on the right.

Equalizers(b)LimitCones(c)

A limit cone over a diagram is a commutative cone over the diagram with the same unique-fill-in property that product cones have.  For example if this is a limit cone over diagram (a) above

PullbackDiagram(d)

and the blue arrows on the left below also form a commutative cone over the same base, then there must be a unique fill-in arrow making everything commute in the diagram on the right.PullbackSawHorse

This particular shape (d) of limit cone is called a pullback diagram. A limit cone in the shape of (b) is an equalizer diagram. It turns out that if you assume that if all pullbacks and equalizers exist, then you have limit cones over every finite diagram for free.  Limit cones of shape (c) don’t have a common name but they will be referred to in a later post.

Since a finite set of objects is just a finite diagram with no arrows, product diagrams are limit diagrams.  The commutativity requirement is then vacuous.

A category has all finite limits is called an FL-category. An older name, which, as computer scientists say, should be deprecated, is a left exact category. From the remark in the previous paragraph, an FL-category is automatically an FP-category.

FL-Sketches

An FL-sketch is a digraph together with some  specified finite formally commutative diagrams and some specified finite formal limit diagrams.  A model of an FL-sketch in a category \mathcal{C} is a functor from the digraph to \mathcal{C} that takes the specified formally commutative diagrams to commutative diagrams and the specified formal limit diagrams to limit diagrams.

The first key to understanding GBLS is that all sorts of interesting kinds of categories (cartesian closed, toposes, symmetric monoidal) are models of FL-sketches.  Indeed, I betcha (lived in Minnesota a whole year) every kind of n-category that anyone ever defined is a model of an FL-sketch — it doesn’t take anything that might be called “n-sketches”.  However, they will have the same problem as ordinary categories in that in the category of models the structure has to be preserved on the nose.  In the case of FL-sketches for different kinds of 1-categories, the sketch’s model category is equivalent to the usual way we define categories of that kind of category.  What “equivalence” means for “every kind of n-category” is referred to in Reference 4.

References

  1. Atish Bagchi and Charles Wells, Graph-Based Logic and Sketches, draft, September 2008, on ArXiv.
  2. Michael Barr and Charles Wells, Category Theory for Computing Science (1999).  Les Publications CRM, Montreal (publication PM023).
  3. Michael Barr and Charles Wells, Toposes, Triples and Theories (2005).  Reprints in Theory and Applications of Categories 1.
  4. n-Labs, Equivalence of categories.

The revolution in technical exposition II

In the last post I talked about the bad side of much technical exposition (exemplified by Newton) and the better aspects of popular science writing (exemplified by Priestley).   These two streams have continued to the present. Stuffy, formal, impersonal technical exposition has continued to be the norm for works intended for academic credit.  Math and science expositions written for the public have been much looser and some have been remarkably good.  I described two of them in a previous post.

The revolution mentioned in the title of this post is that some aspects of the style of popular science writing have begun infiltrating writing in academic journals. Consider these sentences from Jody Azzouni’s essay in [1]:

It’s widely observed that, unlike other cases of conformity, and where social practices really are the source of that conformity, one finds in mathematical practice nothing like the variability found cuisine, clothing, or metaphysical doctrine. (p. 202).

Add two numbers fifteen times, and you do something different each time — you do fifteen different things that (if you don’t blunder) are the same in the respect needed; the sum you write down at the end of each process is the same (right) one. (p. 210).

Written material gives the reader many fewer clues as to the author’s meaning in comparison with a lecture.  Azzouni increases the comprehensibility of his message by doing things that would have been unheard of in a scholarly book on the philosophy of math thirty years ago.

  • He uses italics to emphasis the thrust of his message.
  • He uses abbreviations such as “it’s”.
  • He says “you” instead of “one”:  He does not say “If one adds two numbers fifteen times, one does something different each time…”  This phrase would probably have been nominalized to incomprehensibility thirty years ago: “A computation with fifteen repetitions of the process of numerical addition of a fixed pair of integers involves fifteen distinct actions.”

In abstractmath.org I deliberately adopt a style that is similar to Azzouni’s, including “you” instead of “one”, “it’s” instead of “it is” (and the like), and many other tricks, including bulleted prose, setting off proclamations in purple prose, and so on. (See [2].)  One difference is that I too use italics a lot (actually bold italics), but with a difference of purpose:  I use it for phrases that I think a student should mark with a highlighter.

My discussion of modus ponens from the section Conditional Assertions illustrates some of these ideas:

Method of deduction: Modus ponens

The truth table for conditional assertions may be summed up by saying: The conditional assertion “If P, then Q” is true unless P is true and Q is false.

This fits with the major use of conditional assertions in reasoning:

Method of deduction

  • If you know that a conditional assertion  is true and
  • you know that its hypothesis is true,
  • then you know its conclusion is true.

In symbols:

When “If P then Q” and P are both true,

______________________________________

then Q must be true as well.

This notation means that if the statements above the line are true, the statement below the line has to be true too.

This fact is called modus ponens and is the most used  method of deduction of all.

Remark

That modus ponens is valid is a consequence of the truth table:

  • If  P is true that means that one of the first two lines of the  truth table holds.
  • If the assertion “If P then Q” is true, then one of lines 1, 3 or 4 must hold.

The only possibility, then, is  that Q is true.

Remark

Modus ponens is not a method of proving conditional assertions. It is a method of using a conditional assertion in the proof of another assertion.  Methods for proving conditional assertions are found in the chapter on forms of proof.

This section also includes a sidebar (common in magazines) that says:  “The first statement of modus ponens does not require pattern recognition.  The second one (in purple) does require it.”

Informality, bulleted lists, italics for emphasis, highlighted text, sidebars, and so on all belong in academic prose, not just in popular articles and high school textbooks.  There are plenty of other features about popular science articles that could be used in academic prose, too, and I will talk about them in later posts.

Note: Some features of popular science should not be used in academic prose, of course, such as renaming technical concepts as I discussed in the post of that name.  An example is referring to simple groups as “atoms of symmetry”, since many laymen would not be able to divorce their understanding of the words “simple” and “group” from the everyday meanings:  “HOW can you say the Monster Group is SIMPLE??? You must be a GENIUS!”

References

[1] 18 Unconventional Essays on the Nature of Mathematics, by Reuben Hersh. Springer, 2005.  ISBN 978-0387257174

[2] Attitude, in abstractmath.org.

The revolution in technical exposition

Most of the posts on G&G are in the streams math or language.   Many articles are also in various subcategories. The articles in each stream can be found by looking to the column to the left of this post and scrolling down to “categories”.   (That word has too many meanings…)  I have added a new stream, exposition, and have put four earlier articles in the stream.  They concern expository prose in the sciences.

Old fashioned mathematical and scientific exposition appears to be designed to put as many barriers as possible in the way of the reader.  Some of its properties:

  • Highly formal
  • Full of pronouncements worded in an impersonal way (noun phrases, everything objectified)
  • All traces obliterated of how the results came to be discovered
  • No intuitive explanations

References [2] and [3] go into detail about some of these characteristics.

Steven Johnson, in the Invention of Air [1] describes the classical expository style of Isaac Newton as having these properties.  He also says that Priestley’s book [4] on electricity is in some sense the first popular science book.  It is narrative, not didactic; it uses “I” a lot; it goes into great detail about how the experiments were conducted (read his account of Benjamin Franklin’s experiments starting on page 222), including what were in his opinion the many mistakes of other researchers, and occasionally attempts intuitive descriptions of electricity.

I see that I accidentally published this post, so I will stop here and continue in another post.

References

[1] Steven Johnson, The Invention of Air.  Riverhead Books, 2008.  ISBN 9781594488528.  Reviewed in my post on Priestley.

[2] O’Halloran, K. L. (2005), Mathematical Discourse: Language, Symbolism And Visual Images. Continuum International Publishing Group.  ISBN 978-0826468574.

[3] Halliday, M. A. K. and J. R. Martin (1993), Writing Science: Literacy and Discursive Power. University
of Pittsburgh Press.  ISBN 978-0822961031

[4] Joseph Priestley, The History and Present State of Electricity, with Original Experiments (1775).

Joseph Priestley

The Invention of Air, by Steven Johnson.  Riverhead Books, 2008.  978-1-59448-852-8.   This is a biography of Joseph Priestly:

  • He discovered that, although animals put in a closed box with no source of air died pretty quickly, plants put in a similar box did not die.  This led him to conceive a primitive form of the idea of the cycle of nature. (Note 1.)
  • He discovered oxygen (apparently not really based on the previous discovery above), but did not understand what he discovered.  He continued to believe in phlogiston to the end of his life.
  • He invented soda water because he lived near a brewery.
  • He cofounded the first Unitarian Church in England and wrote extensively about the corruptions of Christianity such as the Trinity.
  • He supported America’s independence and the French Revolution.  Concerning the latter, he exhibited considerable naiveté.
  • Because of the last two things listed, a mob burned down his house and laboratory, his church and the house of one of his supporters.  In consequence he moved to America.
  • He engaged in much correspondence with Thomas Jefferson with the result that Jefferson was relieved to find that he could still consider himself a Christian, of the Unitarian variety, of course.  (Nowadays Unitarians don’t consider themselves Christian but then they did.)
  • He wrote a bunch of sharp attacks on John Adams, in particular accusing him of dastardly behavior in signing the Alien and Sedition Act, and of opposing further advances of science.  Guess which attack made Adams the most furious.  (The latter.)
  • Thomas Jefferson and John Adams were bitter enemies for many years, but engaged in an extensive and reasonably polite correspondence during the last years of their lives.  Much of the correspondence involved Adams defending himself against Priestley’s criticisms.

They never taught me all that in school!  By the way, I probably got all sorts of things wrong in the summary above.  So you’d better read the book from cover to cover.

Scientists should read this book, too; it gives them a new sense of how important they were regarded by the politicians in England, America and France, in comparison to these days.  Politicians should read this book as well, but they won’t.

Popular science

The author claims (pp. 34-35) that Priestley’s work (Note 2) explaining the wonderful new discoveries about electricity constitute the first popular science book (at least of the narrative kind.)

Note

1.  See Priestley’s Experiments and Observations on Different Kinds of Air, Volume III, Book 9, Part 1. (1790).

2.  Joseph Priestley, The History and Present State of Electricity, with Original Experiments (1775).

Introduction to Wikibook on categories

Below is my newly rewritten introduction to the Wikibook on categories.  I am posting it here because of course the Wikibook version is likely to change at any time.

== Introduction ==

This Wikibook is an introduction to category theory.  It is written for those who have some understanding of one or more branches of abstract mathematics, such as group theory, analysis or topology.  The book contains many examples drawn from various branches of math.  If you are not familiar with some of the kinds of math mentioned, don’t worry.  If practically all the examples are unfamiliar, this book may be too advanced for you.

===What is a category?===

A category is a mathematical structure, like a group or a vector space, abstractly defined by axioms.  Groups were defined in this way in order to study symmetries (of physical objects and equations, among other things).  Vector spaces are an abstraction of vector calculus.

What makes category theory different from the study of other structures is that in a sense the concept of category is an abstraction of a kind of mathematics. (This cannot be made into a precise mathematical definition!)  This makes category theory unusually self-referential and capable of treating many of the same questions that mathematical logic treats.  In particular, it provides a language that unifies many concepts in different parts of math.

In more detail, a category has objects and morphisms or arrows.  (It is best to think of the morphisms as arrows: the word “morphism” makes you think they are set maps, and they are not always set maps. The formal definition of category is given in the chapter on categories.)

  • The category of groups has groups as objects and homomorphisms as arrows.
  • The category of vector spaces has vector spaces as objects and linear maps as arrows.

The maps between categories that preserve structure are called functors.

  • The underlying set of a group determines a functor from the category of groups to the category of sets.
  • The fundamental group of a pointed space determines a functor from the category of pointed topological spaces to the category of groups.  The fact that it is a functor means that a continuous point-preserving map from a pointed space S to a pointed space T induces a group homomorphism from the fundamental group of S to the fundamental group of T.

Categories form a category as well, with functors as arrows.  Most fundamentally, functors between specific categories form a category: its morphisms are called natural transformations. The fact that category theory has natural transformations is arguably the single feature that makes category theory so important.

===History===

Category theory was invented by Samuel Eilenberg and Saunders Mac Lane in the 1940’s as a way of expressing certain constructions in algebraic topology.  Category theory was developed rapidly in the subsequent decades.  It has become an autonomous part of mathematics, studied for its own sake as well as being widely used as a unified language for the expression of mathematical ideas relating different fields.

For example, algebraic topology relates domains of interest in geometry to domains of interest in algebra. Algebraic geometry, on the other hand, goes in the opposite direction, associating, for example, with each commutative ring its spectrum of prime ideals.  These fields were among the earliest to be studied using tools of category theory.  Later applications came to abstract algebra, logic, computing science and physics, among others.

===Aspects of category theory===

Because the concept of a category is so general, it is to be expected that theorems provable for all categories will not usually be very deep. Consequently, many theorems of category theory are stated and proved for particular classes of categories.

  • Homological algebra is concerned with Abelian categories, which exhibit features suggested by the category of Abelian groups.
  • Logic is studied using topos theory: a topos is a category with certain properties in common with the category of sets but which allows the logic of the topos to be weaker than classical logic.  It is characteristic of the malleability of category theory that toposes were originally developed to study algebraic geometry.

An important use purpose of categorical reasoning is to identify within a given argument that part which is trivial and separate it from the part which is deep and proper to the particular context. For example, in the study of the theory of the GCD, the fact that it is essentially unique simply follows from the uniqueness of the product in any category and is thus really trivial. On the other hand, the fact that the GCD of the integers A and B can be expressed as a linear combination of A and B with integer coefficients—GCD(a, b) = ma + nb, for some integers m and n —is a much deeper fact that is special to a much more restricted situation.

===Note on terminology===

Most variations in terminology are discussed in the place where the terminology is defined.  Here it is important to point out one annoying terminological problem:  The adjective corresponding to “category” is “categorical”.  Since “categorical” in logic means having only one model up to isomorphism, this can cause cognitive dissonance; in any case, the use of “categorical” in this book has nothing to do with the idea of having only one model.

Some authors use “categorial” instead.  Unfortunately, this means something else in linguistics.  This book follows majority usage with “categorical”.

Must, have to, gotta

A volunteer helping in an intermediate-level ESL class reports that one day the teacher introduced “must” and  “have to”, in contexts such as

  • You must renew your visa = You have to renew your visa.
  • You must have a ticket to get into the show  = You have to have a ticket to get into the show.

In the volunteer’s discussion group this provoked two phenomena:

1.  A heated discussion about “have to have”.  Many students thought that was crazy and couldn’t figure out what it meant.  They didn’t think “have to renew” was crazy, but the usage was unfamiliar to many of them.

2. Partway through the discussion in the subgroup moderated by the volunteer, a student suddenly Saw The Light:  “They’re talking about GOTTA!”  (You gotta renew your visa.  You gotta have a ticket to get into the show.)

“Must” “have to” and “gotta” occur in three different registers of English.  In America, in my experience, “must” is uncommon in speech and occurs mostly in formal writing.  “Have to” (or “hafta”)  is informal and widely used in both speech and writing.  In street-conversation, “gotta” is the usual usage.  It is uncommon in writing.  “You gotta” would be spelled “You’ve got to”.  (You do hear “you’ve gotta” as well as “you gotta”.)

New immigrants are exposed to English in the work place and on the street, not in the home and not usually in formal circumstances.  The teacher should have given “gotta” as a third alternative way of saying “must” right from the start, since clearly that is the term most familiar to most of them.  She should probably have also pointed out the pronunciation “hafta”, which is not obvious from the “have to” spelling unless you are a Sophisticated Amateur Linguist like me.

PS

I should add that negating these expressions introduces complications.  “Must not” does not mean the same thing as “don’t have to”.  “Don’t got” is considered wrong, and plenty of people who say “gotta” in conversation, including me, don’t say “don’t gotta”; I would say “don’t have to” or “don’t need to” instead.

Definitions into mathematical objects 4

This is the fourth post in a series, continuing TDMO1, TDMO2 and TDMO3.  This series builds up to an explanation of the concept of form in the paper Graph-Based Logic and Sketches by Atish Bagchi and me.

What has happened so far

I have defined special cases of sketch and the concept of the cattheory of a sketch for those cases. The special cases include

  • The sketch is a digraph (DM, page 218).  Then the category of models is a topos and and cattheory is the free category generated by the digraph.
  • The sketch is a digraph with some formally commutative diagrams.  Then the models are universal algebras of a specific type with only unary operations (Note 1) and the cattheory is the free category generated by the digraph modulo the commutative diagrams.

In this post I will show how to require sketches to preserve categorical limits, starting with products.

Products

A (binary) product diagram in a category is a cone to {a,b}, in other words a diagram that looks like this

(1)TwistedProductDiagram

with this specific universal property:  For any other cone to {a,b}

NoFiProduct

(2)

there is a unique fill-in arrow fi such that this diagram commutes (see Note 2):

FiProduct

(3)

This diagram has been called a sawhorse. But like most metaphors in math, this name is misleading.  The sawhorse is not symmetrical with respect to its two pairs of legs.  The black legs form a product cone but the blue legs need not be a product diagram. It could be called a directed sawhorse.

Another useful way of thinking about diagram (3)  is that the construction of the fill-in arrow is like an algebraic operation whose domain is the set of diagrams of the form (2) for a fixed product diagram (1)  and whose output is the fill-in arrow.   Now ordinary algebraic operations have domain some cartesian product of sets with output in a set.  For example scalar multiplication of a real vector space is an operation \mathbb{R}\times V\longrightarrow V.  The fill-in arrow is an operation on a set of tuples of arrows, but the domain is not the cartesian products of arrow sets; instead it is an equationally defined subset of a cartesian product of arrow sets.

By making the blue cone also a product diagram you get an instant proof that the resulting fi is an isomorphism.  Thus “products in a category are determined up to a unique isomorphism”.

Reference [1], Chapter 5 gives a detailed explanation of products in categories at a rather elementary level.

Finite products

Finite products of more than two objects in a category can be define analogously.  However, if you assume you have all products of two objects then you automatically get all finite products of two or more elements.  It is an easy exercise to see that every category has all products of one object.  A product of no objects is a terminal object and that has to be assumed separately.  An example of this is the category you get if the singleton groups are untimely ripp’d from the category of groups — it has all finite products but no terminal object.

A category has finite products if there is a product diagram for any finite set of objects.

FP sketches

An FP sketch is a digraph together with some  specified formally commutative diagrams and some specified formal product diagrams.  A model of an FP sketch in a category \mathcal{C} is a functor from the digraph to \mathcal{C} that takes the specified formally commutative diagrams to commutative diagrams and the specified formal product diagrams to product diagrams.

The category of all algebras for any specified type of universal algebra (with finitary operations) is  equivalant to the category of models of an FP sketch.  Chapter 4 of [2] describes an FP theory for the category of groups starting on page 126.

The homomorphisms in the category of models of an FP sketch must preserve the designated product diagrams on the nose.  The sketch for groups just mentioned has three designated formal product cones, for the terminal object, G^2 and G^3This is just a technicality although some mathematicians make a big deal out of it.  In fact when mathematicians talk about the “category of groups” they don’t usually even say which product cone they mean by G^2 and G^3.  There is a neat way to handle this using categories enriched over groupoids, but never mind.

Next: finite limits!

Notes

  1. Universal algebras of a specific type with only unary operations (not even constants allowed) are a topos, but in general universal algebras of a specific type (Birkhoff varieties) do not form a topos.  See Johnstone, P. T., When is a variety a topos? Algebra Universalis 21 (1985), 198–212.
  2. Why O Why does the conversion of these diagrams from PDF to JPEG darken the blue so it doesn’t match the text???

References

  1. Michael Barr and Charles Wells, Category theory for computing science (1999).  Les Publications CRM, Montreal (publication PM023).
  2. Michael Barr and Charles Wells, Toposes, Triples and Theories (2005).  Reprints in Theory and Applications of Categories 1.

Definitions into mathematical objects 3

This is the third post in a series, continuing TDMO1 and TDMO2, aimed at explaining the concept of forms in Graph-Based Logic and Sketches.

Clones and theories

Sketches as I have described them so far can now describe some kinds of universal algebras, namely those with unary operations and equations.  n-ary operations and more elaborate constructions will come in a later post.

The first big construction in universal algebra is that of the clone of a type of algebra (commonly called a signature): a specific set of n-ary operations for various n and specific equations involving those operations.  The clone is essentially all the expressions you can create from the operations, requiring two expressions to be equivalent if you can prove they are equivalent given the equations of the algebra.   (For example, in the clone for semigroups, x(y(xy)) is equivalent to (xy)(xy).  On the other hand, xy is not equivalent to yx because there are noncommutative semigroups, so xy = yx can’t possibly follow from the equations.)

The Lawvere theory of the algebra is a different construction which essentially expresses the clone as a special kind of category, with models of the theory being product-preserving functors.

In model theory a first-order theory is an extension of the concept of clone that allows relations other than equality, as well as the use of negation and quantifiers.  First order theories are constructed in a different way from clones and Lawvere theories but they capture the same general idea in the bigger context.

The cattheory of a sketch

For a sketch, the idea equivalent to the clones and theories just described is that of a cattheory. (This is a new coinage.  See Terminology below.)

Suppose we have a linear sketch \mathcal{S} = (\mathbf{G},D) where G is a digraph and D is a set of formally commutative diagrams.  The cattheory of the linear sketch, denoted by \text{CatTh}(\mathcal{S}), is the free category generated by G with the condition imposed that the diagrams in D must become commutative.    The digraph morphism G:\mathbf{G}\to \text{CatTh}(\mathcal{S}) that takes the nodes and arrows in G to the corresponding objects and arrows of the free category is by definition a model of the sketch \mathcal{S}. G is called the generic model of the sketch.

A model of the cattheory \text{CatTh}(\mathcal{S}) in a category \mathcal{C} is simply a functor.  We don’t have to impose properties on the functor because functors automatically preserve commutative diagrams.  When we get into more complicated structures we will have to add preservation requirements on the model functor.  (See note 1.)

The category of models of the linear sketch \mathcal{S} in a category \mathcal{C} is denoted by \mathbf{Mod}\left( \mathcal{S},\mathcal{C} \right).  The category of models of the cattheory \text{CatTh}(\mathcal{S}) in \mathcal{C} is denoted by \mathbf{Mod}\left( \text{CatTh}(\mathcal{S}),\mathcal{C} \right).

The cattheory has this universal property:

Theorem The map M\mapsto GM:\mathbf{Mod}\left( \mathcal{S},\mathcal{C} \right)\to \mathbf{Mod}\left( \text{CatTh}(\mathcal{S}),\mathcal{C} \right)

is an equivalence of categories. (Note 2).

This theorem forces the cattheory to be determined up to natural equivalence of categories that commutes with the generic model.  For all practical purposes, a model of the sketch is thus the same thing as a model of its cattheory.  This will remain true as we go up the hierarchy of new constructions (n-ary operations, limits and colimits, and other things) and is the defining property of the cattheory.   Clones, Lawvere theories and first order theories are all cattheories up to equivalence.

How to think about cattheories

\text{CatTh}(\mathcal{S})  may be thought of as the minimal category that contains a model of \mathcal{S}, the model being the generic model G. Every model M of \mathcal{S} in any category \mathcal{C} must induce a unique (up to natural isomorphism) model of \text{CatTh}(\mathcal{S}) in \mathcal{C}, simply because everything in \mathcal{S} must be there because any category containing a model of \mathcal{S}  must have everything in \text{CatTh}(\mathcal{S}).

For example, consider the sketch for endofunctions in TDMO2.    Because s is in the graph of the sketch, every power of M(s) must be in any category \mathcal{C} containing a model M of the sketch.  This forces by induction a unique functor (model) from the cattheory of the sketch to \mathcal{C}.  (Usually the induced functor is unique only up to natural isomorphism, but this time it is rilly rilly unique.)

Terminology

These remarks apply in general to all kinds of sketches, not just the restricted version we are considering here.

The situation in the literature is confusing.

  • In [2] and in the other books and articles by Michael Barr and/or me, we call the cattheory the theory.
  • In [3], we refer to the cattheory of a finite limit sketch as a theory or as a categorial theory in an attempt to distinguish it from some version or other of logical theory.  We said “categorial theory” instead of “categorical theory” because the latter phrase means something different to logicians.  Unfortunately, “categorial” means something different to linguists!
  • In [1], Johnstone calls the cattheory of a sketch the syntactic category.
  • In [3], page 33, we refer to SynCat[ f ], where f is a form (generalized sketch).  Now f has a cattheory, but it is not SynCat[ f ], which is the cattheory correspond to the doctrine that f belongs to (the type of categories it can have models in.)

I am using “cattheory” so that I will have a neologism that doesn’t mean anything different to anybody.  Personally, I think we should keep using “theory”, because the theory (syntactic category in the sense of Johnstone) is ultimately the same thing (satisfies the same universal property) as the corresponding logical theory; the difference is only in the construction.

Notes

  1. When we have more structure to talk about in the models, we will have to distinguish which cattheory we are talking about for a sketch.  For the linear sketch \mathcal{S}we will refer to \text{CatTh}(\text{LS}, \mathcal{S}), where “LS” refers to linear sketches.  That is the doctrine this particular sketch belongs to.
  2. You have to say what the map M\mapsto GM:\mathbf{Mod}\left( \mathcal{S},\mathcal{C} \right)\to \mathbf{Mod}\left( \text{CatTh}(\mathcal{S}),\mathcal{C} \right)does to morphisms of models, which are natural transformations (remember the models are functors.)   This is easy.

References

  1. 1. Sketches of an Elephant: A Topos Theory Compendium, Volume 2 (Oxford Logic Guides 44), by Peter T. Johnstone.  Oxford University Press, ISBN 978-0198524960.
  2. 2. Toposes, Triples and Theories (online edition), by Michael Barr and Charles Wells. All references to page numbers refer to the online edition, not to the Springer volume, which should be forgotten forever.
  3. 3. Graph-Based Logic and Sketches, Atish Bagchi and Charles Wells (latest draft September, 2008).

Commonword names for technical concepts

In a previous post I talked about the use of commonword names for technical concepts, for example, “simple group” for a group with no proper normal subgroups.  This makes the monster group a simple group!  Lay readers on the subject might very well feel terminally put-down by such usage.  (If he calls that “simple” he must be a genius.  How could I ever understand that?  See note 1.)  Mark Ronan used of “atom of symmetry” instead of “simple group” in his book Symmetry and the Monster, probably for some such reason.

Recently I had what used to be called a CAT scan and (perhaps) what used to be called a PET scan on the same day.   The medically community now refers to CT scan or nuclear imaging.   This may be because too many clients were thinking of doing sadistic testing on cats or other pets.   But I have not been able to confirm that.

The nurse called the CT scan an x-ray.  Well, of course, it is an x-ray, but it is an x-ray with tomography.  She explicitly said that calling CT scans x-rays was common usage in their lab.  In the past, other medical people have said to me, “It used to be called CAT scan but now it is CT scan.”   But no one said why.

The situation about PET scan is more complicated.  I didn’t raise the question with the nurse, and Wikipedia has separate articles about PET scans and nuclear imaging, even though they both use positrons and tomography.   The chemicals mentioned for PET are isotopes of low-atomic-number elements, whereas the nuclear medicine article mentions technetium99 as the most commonly used isotope.  Nowhere does it explain the difference.  I wrote a querulous note in the comments section of the NM article requesting clarification.

Note 1.  ”If he calls that ’simple’ he must be a genius.  How could I ever understand that?”   Do not dismiss this as the reaction of a stupid person.  This kind of ready-to-be-intimidated attitude is very common among intelligent, educated, but non-technically-oriented people.   If mathematicians dismiss people like that we will  continue to find mathematics anathema among educated people.  We need people to feel that they understand something about what mathematicians do (I use that wording advisedly).  Even if you are an elitist who doesn’t give a damn about ordinary people, remember who funds the NSF. See co-intimidator.

Mathematical concepts

This post was triggered by John Armstrong’s comment on my last post.

We need  to distinguish two ideas: representations of a mathematical concept and the total concept.  (I will say more about terminology later.)

Example: We can construct the quotient of the kernel of a group homomorphism by taking its cosets and defining a multiplication on them.  We can construct the image of the homomorphism by take the set of values of the homomorphism and using the multiplication induced by the codomain group.   The quotient group and the image are the same mathematical structure in the sense that anything useful you can say about one is true of the other.   For example, it may be useful to know the cardinality of the quotient (image) but it is not useful to know what its elements are.

But hold on, as the Australians say, if we knew that the codomain was an Abelian group then we would know that the quotient group was abelian because the elements of the image form a subgroup of the codomain. (But the Australians I know wouldn’t say that.)

Now that kind of thinking is based on the idea that the elements of the image are “really” elements of the codomain whereas elements of the quotients are “really” subsets of the domain.  That is outmoded thinking.  The image and the quotient are the same in all important aspects because they are naturally isomorphic.   We should think of the quotient as just as much as subgroup of the codomain as the image is.  John Baez (I think) would say that to ask whether the subgroup embedding is the identity on elements or not is an evil question.

Let’s step back and look at what is going on here.  The definition of the quotient group is a construction using cosets.  The definition of the image is a construction using values of the homomorphism.  Those are two different specific  representations of the same concept.

But what is the concept, as distinct from its representations?  Intuitively, it is

  • All the constructions made possible by the definition of the concept.
  • All the statements that are true about the concept.

(That is not precise.)

The total concept is like the clone plus the equational theory of a specific type of algebra in the sense of universal algebra.  The clone is all the operations you can construct knowing the given signature and equations and the equational theory is the set of all equations that follow from them.  That is one way of describing it.  Another is the monad in Set that gives the type of algebra — the operations are the arrows and the equations are the commutative diagrams.

Note: The preceding description of the monad is not quite right.  Also the whole discussion omits mention of the fact that we are in the world (doctrine) of universal algebra.  In the world of first order logic, for example, we need to refer to the classifying topos of the category of algebras of that type (or to its first order theory).

Terminology

We need better terminology for all this.  I am not going to propose better terminology, so this is a shaggy dog story.

Math ed people talk about a particular concept image of a concept as well as the total schema of the concept.

In categorical logic, we talk about the sketch or presentation of the concept vs. the theory. The theory is a category (of the kind appropriate to the doctrine) that contains all the possible constructions and commutative diagrams that follow from the presentation.

In this post I have used “total concept” to refer to the schema or theory.  I have referred the particular things as  “representations” (for example construct the image of a homomorphism by cosets or by values of the homomorphism).

“Representation” does not have the same connotations as “presentation”.  Indeed a presentation of a group and a representation of a group are mathematically  two different things.  But I suspect they are two different aspects of the same idea.

All this needs to be untangled.  Maybe we should come up with two completely arbitrary words, like “dostak” and “dosh”.

Different names for the same thing

I recommend reading the discussion (to which I contributed) of the post “Why aren’t all functions well-defined?” on Gower’s Weblog.   It resulted in an insight I should have had a long time ago.

I have been preaching the importance of different ways of thinking about a math object (different images, metaphors, mental representations — there are too many names for this in the math ed literature).   Well, mathematicians at least occasionally use different names for a type of math object to indicate how they are thinking about it.

Examples

We talk about a relation and we talk about multivalued functions. Those are two different ways of talking about the same thing (they are the same by an adjunction).   A relation is a predicate.  A multivalued function is a function except that it can have more than one output for a given input.  But they are the same thing.

We talk about an equivalence relation and we talk about a partition of a set (or a quotient set).  The category of equivalence relations and the category of partitions of sets are naturally isomorphic, not merely equivalent.  But one is a special kind of relation and the other is a grouping.

Let’s be open about what we do

We should be explicit about the way we think about and do math.  We have several different ways to think about any interesting type of math object and we should push this practice to students as being absolutely vital.  In particular we (some of us) use different names sometimes for the same object and we refuse to give them up, muttering about “reductionism” and “nothing buttery”.

Some students arrive in class already as (pedantic?)(geeky?) as many mathematicians (I am a recovering pedant myself).  We need to be up front about this phenomenon and explain the value of thinking and talking about the same thing in different ways, even using different words.

It used to be different but now it’s the same

A kind of opposite phenomenon occurs with some students and mathematicians of a certain personality type.  Consider the name “multivalued function”.  Of course a multivalued function is not (necessarily) a function.  Your mother-in-law  is not your mother, either.  I go on about this (using ideas from Lakoff) in the Handbook under “radial concept”.   Pedantic types can’t stand this kind of usage.  “A multivalued function can’t be a function”.  “Equivalence relations and partitions are not the same thing because one is a relation and the other is a set of sets.”  “The image of a homomorphism and the quotient by its kernel are not the same thing because…”

This attitude makes me tired.  Put your hands on the tv screen and think like a category theorist.

Definitions into mathematical objects 2

Introduction

This post continues the discussion of sketches begun in the post Turning Definitions into Mathematical Objects (TDMO1). This series is building up to an explanation of the ideas in the monograph Graph-Based Logic and Sketches, by Atish Bagchi and me. In the previous post, I introduced the idea of a sketch, which is in its simplest form a directed graph (digraph), and a model of a sketch, which is a functor from the digraph to the category of sets. A morphism of models is then a natural transformation between models.

In TDMO1, I described the sketch for directed graphs. We need another example.

The sketch for an endofunction

Loop

(1)

Let \mathcal{L} be the digraph at the left. It has one node c and one arrow s, whose source and target are (necessarily) c. A model of this sketch is a set C and a function from C to itself. Any such function determines a cyclic semigroup of transformations of C. For later use, I want to mention the the particular model in which C is the set of natural numbers and s is the successor function: this is the free semigroup on one letter.

You may want to experiment with other digraphs as sketches. Any digraph produces a category of models in sets. Each such category is a category of presheaves, hence is a topos ([1], p. 67, Theorem 2.4).

Imposing equations

We need to expand the idea of sketch to be able to define more kinds of structure (see “Specifying more kinds of structure” in TDMO1). Let’s start with equations. Suppose we take the sketch for an endofunction and want to modify it so that s^3=1 in every model in Set. This means we must require that the diagram (in the category of sets) below commute for every model M.

ThreeSTo say that this diagram commutes requires that M(s)\circ M(s)\circ M(s)=\text{id}_{M(c)} (see Note 1.) The obvious way to do this is to say that we require that a model is a functor from the digraph (1) to the category of sets that has the property that the image of the digraph

BareTriangleunder M is a commutative diagram in the category of sets. We say that this digraph is formally commutative. In general, in sketch theory, something is formally P if in every model of the sketch its image is required to be P.

This gives us a more general notion of sketch which allows the specification of equations, although so far only between unary operations. Formally, a linear sketch (\mathbf{G}, \cal{D}) consists of a digraph G together with a set \cal{D} of zero or more formally commutative diagrams. It should be apparent that an linear sketch can specify anything that a universal algebra signature with only unary operations can specify. But linear sketches can specify much more than that because they can specify multisorted algebras.

Models in arbitrary categories

A model in Set of a linear sketch (\mathbf{G}, \cal{D}) is a functor M:\mathbf{G}\rightarrow\textbf{Set} with the property that M takes every diagram in to a commutative diagram in Set.  (Note 2).

This definition would still be meaningful if we replaced Set by any category whatever. So now we can talk about a model of (\mathbf{G}, \cal{D}) in any category C. For example, a model in the category of groups of the endofunction sketch above that requires the cube to be the identity function is simply a group with a specified automorphism of order 1 or 3. Indeed, in any category of structures of a certain type, a model is a structure of that type with a specified automorphism of order 1 or 3.

The more elaborate sketches we construct later will still allow models in any category. For example, a model of the sketch for groups (see reference [1], starting on page 126) in the category of Hausdorff spaces is precisely a Hausdorff topological group. Ancient cute theorem: A model of the sketch for groups in the category of groups is an Abelian group. That’s because the group operations must be homomorphisms!

Next we will look at what corresponds in universal algebra to clones.

Notes

1. Saying exactly what “this diagram commutes” means for any particular  exhibited diagram requires fussiness. Chapter 2 of GBLS goes into excruciating detail about this.

2. A model of a linear sketch in the category of sets need not be a topos.

3. A model of a digraph (sketch without equations) in an arbitrary category is not in general a topos.  But a model of a digraph in a topos is a topos.

References

[1] Michael Barr and Charles Wells, Toposes, Triples and Theories. All references to page numbers refer to the online edition, not to the Springer volume, which should be forgotten forever.

[2] Atish Bagchi and Charles Wells, Graph-Based Logic and Sketches.

[3] Charles Wells, Sketches: Outline with references. Warning: This paper contains errors and omissions and will be revised Real Soon Now.

Distributive plurals

A statement in English such as “all squared nonzero real numbers are positive” is called a distributive plural.  This means that the statement “the square of x is positive” is true for every nonzero real number.  It can be translated directly into symbolic notation:  \forall x\,(\text{if }x\ne 0\text{ then }{{x}^{2}}>0)

Not all statements involving plurals in English are distributive plurals.  The statement “The agents are surrounding the building” does not imply that Agent James is surrounding the building.  This type of statement is called a collective plural. Such a statement cannot be translated directly into a statement involving a universal quantifier.  More about this here.  This discussion on Wordwizard suggests that there may be a difference between British and American usage.

The word “distributive” as used here is analogous to the distributive law of arithmetic.  If the set of things referred to is finite, for example the set {-2, -1, 1, 3} then one can say  that “\forall x\,({{x}^{2}}>0)” is equivalent to “{{(-2)}^{2}}>0\text{ and }{{(-1)}^{2}}>0\text{ and }{{\text{1}}^{2}}>0\text{ and }{{\text{3}}^{2}}>0”.

I once found a report on the internet that a Quaker Oats box contained this exhortation: “Eating a good-sized bowl of Quaker Oatmeal for 30 days will actually help remove cholesterol from your body.”  This undoubtedly exhibits a confusion between distributive plurals and the other kind of plural, but I don’t understand the connection well enough to explain it.

I can no longer find the report on the internet.  This may mean the Quaker Oats box with that label never existed.