
Monday, October 27, 2014

Two new papers on abstract (mathematical) explanation

There has not been much activity here lately, but I wanted to link to two new papers of mine that tackle the vexing issue of mathematical explanation, both within mathematics and in science. I try to isolate a kind of "abstract" explanation using two case studies and to explore its significance.

The Unsolvability of the Quintic: A Case Study in Abstract Mathematical Explanation
Philosophers' Imprint, forthcoming.
Abstract: This paper identifies one way that a mathematical proof can be more explanatory than another proof. This is by invoking a more abstract kind of entity than the topic of the theorem. These abstract mathematical explanations are identified via an investigation of a canonical instance of modern mathematics: the Galois theory proof that there is no general solution in radicals for fifth-degree polynomial equations. I claim that abstract explanations are best seen as describing a special sort of dependence relation between distinct mathematical domains. This case study highlights the importance of the conceptual, as opposed to computational, turn of much of modern mathematics, as recently emphasized by Tappenden and Avigad. The approach adopted here is contrasted with alternative proposals by Steiner and Kitcher.

Abstract Explanations in Science
British Journal for the Philosophy of Science, forthcoming.
A previous version of this paper is online here.
Abstract: This paper focuses on a case that expert practitioners count as an explanation: a mathematical account of Plateau's laws for soap films. I argue that this example falls into a class of explanations that I call abstract explanations. Abstract explanations involve an appeal to a more abstract entity than the state of affairs being explained. I show that the abstract entity need not be causally relevant to the explanandum for its features to be explanatorily relevant. However, it remains unclear how to unify abstract and causal explanations as instances of a single sort of thing. I conclude by examining the implications of the claim that explanations require objective dependence relations. If this claim is accepted, then there are several kinds of objective dependence relations.

It remains to be seen if this "ontic" approach is the best way to go, but I believe it is a promising avenue to explore.

Thursday, July 14, 2011

The unplanned impact of mathematics (Nature)

Peter Rowlett has assembled, with some other historians of mathematics, seven accessible examples of how theoretical work in mathematics led to unexpected practical applications. His discussion seems to be primarily motivated by the recent emphasis on the "impact" of research, both in Britain and in the US:
There is no way to guarantee in advance what pure mathematics will later find application. We can only let the process of curiosity and abstraction take place, let mathematicians obsessively take results to their logical extremes, leaving relevance far behind, and wait to see which topics turn out to be extremely useful. If not, when the challenges of the future arrive, we won't have the right piece of seemingly pointless mathematics to hand.
For philosophers, the most important example to keep in mind, I think, is the last one, offered by Chris Linton: the role of Fourier series in promoting the later "rigorization" of math:
In the 1870s, Georg Cantor's first steps towards an abstract theory of sets came about through analysing how two functions with the same Fourier series could differ.
Rowlett has a call for more examples on the BSHM website. Hopefully this will convince some funding agencies that immediate impact is not a fair standard!

Thursday, October 7, 2010

Okasha Takes On The Inclusive Fitness Controversy

In a helpful commentary in the current issue of Nature, Samir Okasha summarizes the recent dispute about inclusive fitness. In an article from earlier this year, E. O. Wilson and two collaborators argued that inclusive fitness (or kin selection) can be dispensed with in explaining altruistic behavior. For my purposes, what is most interesting about this debate is that Wilson's argument depends on an alternative mathematical treatment which seems to remove the need for anything to track inclusive fitness. As a result, inclusive fitness is seen merely as a book-keeping device with no further explanatory significance.

Okasha suggests that the dispute is overblown and that each of the competing camps should recognize that a divergence in mathematical treatment need not signal any underlying disagreement. As he puts it at one point:
Much of the current antagonism could easily be resolved — for example, by researchers situating their work clearly in relation to existing literature; using existing terminology, conceptual frameworks and taxonomic schemes unless there is good reason to invent new ones; and avoiding unjustified claims of novelty or of the superiority of one perspective over another.

It is strange that such basic good practice is being flouted. The existence of equivalent formulations of a theory, or of alternative modelling approaches, does not usually lead to rival camps in science. The Lagrangian and Hamiltonian formulations of classical mechanics, for example, or the wave and matrix formulations of quantum mechanics, tend to be useful for tackling different problems, and physicists switch freely between them.
This point is right as far as it goes, but my impression is that some biologists and philosophers of biology over-interpret the concept of fitness. If Wilson et al. are correct, then there is simply no need to believe that inclusive fitness tracks any real feature of biological systems. And this interpretative result would be significant for our understanding of altruism and natural selection more generally.

Wednesday, April 28, 2010

Mathematical Explanation in the NYRB

In his recent review of Dawkins' Oxford Book of Modern Science Writing, Jeremy Bernstein characterizes one entry as follows:
W.D. Hamilton’s mathematical explanation of the tendency of animals to cluster when attacked by predators.
The article in question is "Geometry for the Selfish Herd", Journal of Theoretical Biology 31 (1971): 295-311. (Online here.) Given the ongoing worries about the existence and nature of mathematical explanations in science, it is worth asking what led Bernstein to characterize this explanation as mathematical.

The article summarizes two models of predation which are used to support the conclusion that the avoidance of predators "is an important factor in the gregarious tendencies of a very wide variety of animals" (p. 298). The first model considers a circular pond where frogs, the prey, are randomly scattered along the edge. The predator, a single snake, comes to the surface of the pond and strikes whichever frog is nearest. Hamilton introduces the notion of a frog's domain of danger: the stretch of pond edge lying closer to that frog than to any other, so that a snake surfacing anywhere along that stretch would strike it. Hamilton points out that the frogs can reduce their domains of danger by jumping together. In this diagram the black frog jumps between two other frogs:



So, "selfish avoidance of a predator can lead to aggregation."

In the slightly more realistic two-dimensional case, Hamilton generalizes the domains of danger to polygons whose sides are formed by bisecting the lines connecting neighboring prey:



Hamilton notes that it is not known what the best general strategy is here for a prey organism to minimize its domain of danger, but he gives rough estimates to justify the conclusion that moving towards one's nearest neighbor is appropriate. This is motivated in part by the claim that "Since the average number of sides is six and triangles are rare (...), it must be a generally useful rule for a cow to approach its nearest neighbor."

So, we can explain the observed aggregation behavior using the ordinary notion of fitness and an appeal to natural selection. What is the mathematics doing here and why might we have some sort of specifically mathematical explanation? My suggestion is that the mathematical claim that strategy X minimizes (or reliably lowers) the domain of danger is a crucial part of the account. Believing this claim and seeing its relevance to the aggregation behavior is essential to having this explanation. Furthermore, this seems like a very good explanation. What implications this has for our mathematical beliefs remains, of course, a subject for debate.
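For what it's worth, Hamilton's one-dimensional set-up is easy to experiment with. Here is a small Python sketch of the circular pond; the jump rule (each frog hops to whichever gap midpoint most shrinks its own domain of danger) is my own simplification rather than Hamilton's exact procedure, but it preserves the selfish logic, and running it shows the frogs bunching together rapidly.

```python
# A toy version of Hamilton's one-dimensional model: frogs on the rim of a
# circular pond, each with a domain of danger equal to the stretch of rim
# closer to it than to any other frog.  The jump rule is my own
# simplification, not Hamilton's exact procedure.
import random

CIRCUM = 1.0  # circumference of the pond edge

def circ_dist(a, b):
    d = abs(a - b) % CIRCUM
    return min(d, CIRCUM - d)

def domain(p, others):
    """Domain of danger of a frog at p: half the gap to each adjacent frog."""
    left = min((p - q) % CIRCUM for q in others)
    right = min((q - p) % CIRCUM for q in others)
    return (left + right) / 2.0

def selfish_move(i, frogs):
    """Frog i jumps to the gap midpoint that minimizes its own domain of
    danger, staying put if no move is strictly better."""
    p = frogs[i]
    others = sorted(q for j, q in enumerate(frogs) if j != i)
    candidates = [p]
    for k in range(len(others)):
        a, b = others[k], others[(k + 1) % len(others)]
        gap = (b - a) % CIRCUM
        candidates.append((a + gap / 2.0) % CIRCUM)
    return min(candidates, key=lambda c: domain(c, others))

def mean_nearest_neighbour(frogs):
    return sum(min(circ_dist(p, q) for j, q in enumerate(frogs) if j != i)
               for i, p in enumerate(frogs)) / len(frogs)

if __name__ == "__main__":
    random.seed(0)
    frogs = [random.random() for _ in range(30)]
    for rnd in range(5):
        print("round %d: mean nearest-neighbour distance %.5f"
              % (rnd, mean_nearest_neighbour(frogs)))
        for i in range(len(frogs)):       # frogs move one at a time
            frogs[i] = selfish_move(i, frogs)
```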

Tuesday, October 6, 2009

Mathematics, Financial Economics and Failure

In a recent post I noted Krugman's point about economics being seduced by attractive mathematics. Since then there have been many debates out there in the blogosphere about the failures of financial economics, but little discussion of the details of any particular case. I want to start that here with a summary of how the most famous model in financial economics is derived. This is the Black-Scholes model, given as (*) below. It expresses the correct price V for an option as a function of the current price of the underlying stock S and the time t.

My derivation follows Almgren, R. (2002). Financial derivatives and partial differential equations. American Mathematical Monthly 109: 1-12.

In my next post I aim to discuss the idealizations deployed here and how reasonable they make it to apply (*) in actual trading strategies.

A Derivation of the Black-Scholes Model

A (call) option gives the owner the right to buy some underlying asset like a stock at a fixed price K at some time T. Clearly some of the factors relevant to the fair price of the option now are the difference between the current price of the stock S and K, as well as the length of time between now and time T when the option can be exercised. Suppose, for instance, that a stock is trading at $100 and the option gives its owner the right to buy the stock at $90. If the option could be exercised at that moment, it would be worth $10. But if it is six months or a year until the option can be exercised, what is a fair price to pay for the $90 option? It seems like a completely intractable problem that could depend on any number of factors, including features specific to that asset as well as an investor's tolerance for risk. The genius of the Black-Scholes approach is to show how certain idealizing assumptions allow the option to be priced at V given only the current stock price S, a measure of the volatility of the stock price σ, the prevailing interest rate r and the length of time between now and time T when the option can be exercised. The only unknown parameter here is σ, the volatility of the stock price, but even this can be estimated by looking at the past behavior of the stock or of similar stocks. Using the value V computed from this equation, a trader can execute what appears to be a completely risk-free hedge. This involves either buying the option and selling the stock or selling the option and buying the stock. The position is apparently risk-free because the direction of the stock price is not part of the model, and so the trader need not take a stand on whether the stock price will go up or down.

The basic assumption underlying the derivation of (*) is that markets are efficient, so that "successive price changes may be considered as uncorrelated random variables" (Almgren, p. 1). The time interval between now and the time T when the option can be exercised is first divided into N time-steps. We can then deploy a lognormal model of the change in price δS_j at time-step j:

δS_j = a δt + σ S ξ_j

The ξ_j are random variables whose mean is zero and whose variance is 1 (Almgren, p. 5). Our model reflects the assumption that the percentage size of the random changes in S remains the same as S fluctuates over time (Almgren, p. 8). The parameter a indicates the overall "drift" in the price of the stock, but it drops out in the course of the derivation.
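(As an aside, this price model is easy to simulate. The sketch below uses the standard Euler-Maruyama discretization of geometric Brownian motion, with the usual √δt scaling of the noise and an S-proportional drift; these are the textbook continuous-time conventions rather than anything read directly off Almgren's notation, and the parameter values are arbitrary.)

```python
# A minimal simulation of the lognormal price model.  I use the standard
# Euler-Maruyama discretization of geometric Brownian motion; the sqrt(dt)
# noise scaling and the S-proportional drift are the usual continuous-time
# conventions, and the parameter values below are arbitrary.
import math
import random

def simulate_price(S0=100.0, a=0.05, sigma=0.2, T=1.0, N=252, seed=1):
    """Return a simulated price path S_0, S_1, ..., S_N over [0, T]."""
    random.seed(seed)
    dt = T / N
    S, path = S0, [S0]
    for _ in range(N):
        xi = random.gauss(0.0, 1.0)                      # mean 0, variance 1
        S = S + a * S * dt + sigma * S * math.sqrt(dt) * xi
        path.append(S)
    return path

if __name__ == "__main__":
    print("final price after one year: %.2f" % simulate_price()[-1])
```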

Given that V is a function of both S and t, we can approximate a change in V for a small time-step δt using a series expansion known as a Taylor series:

δV = V_t δt + V_S δS + 1/2 V_{SS} δS^2

where additional higher-order terms are dropped. Given an interest rate of r for the assets held as cash, the corresponding change in the value of the replicating portfolio Π = DS+C of D stocks and C in cash is

δΠ = D δS + rC δt

The last two equations allow us to easily represent the change in the value of a difference portfolio which buys the option and offers the replicating portfolio for sale. The change in value is

δ(V - Π) = (V_t - rC) δt + (V_S - D) δS + 1/2 V_{SS} δS^2

The δS term reflects the random fluctuations of the stock price, and if it could not be dealt with we could not derive a useful equation for V. But fortunately the δS term can be eliminated if we assume that at each time-step the investor can adjust the number of shares held so that

D = V_S

Then we get

δ(V - Π) = (V_t - rC) δt + 1/2 V_{SS} δS^2

The δS^2 term remains problematic for a given time-step, but we can evaluate its sum over all the time-steps using our lognormal model. This permits us to simplify the equation so that, over the whole time interval Δt,

Δ(V - Π) = (V_t - rC + 1/2 σ^2 S^2 V_{SS}) Δt

Strictly speaking, we are here applying a result known as Ito's Lemma.

What is somewhat surprising is that we have found the net change in the value of the difference portfolio in a way that has dropped any reference to the random fluctuations of the stock price S. This allows us to deploy the efficient market hypothesis again and assume that Δ(V - Π) is identical to the result of investing V - Π in a risk-free bank account with interest rate r. That is,

Δ(V - Π) = r (V - Π) Δt

But given that V - Π = V - DS - C and D = V_S, we can simplify the right-hand side of this equation to

(rV - r V_S S - rC) Δt

Given our previous equation for the left-hand side, we get

(*) V_t + 1/2 σ^2 S^2 V_{SS} + rSV_S - rV = 0

after all terms are brought to the left-hand side.
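For readers who want to see (*) in action, here is a small numerical check. The closed-form value of a European call is a textbook result that I simply quote rather than derive, and the parameter values (including the $90 strike from the example above) are arbitrary; the finite-difference residual should come out very close to zero, confirming that this V satisfies (*).

```python
# The closed-form Black-Scholes value of a European call (a textbook result,
# quoted rather than derived here) and a finite-difference check that it
# satisfies (*).  Parameter values, including the $90 strike, are arbitrary.
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_value(S, t, K=90.0, T=1.0, r=0.05, sigma=0.2):
    """Black-Scholes value V(S, t) of a European call with strike K, expiry T."""
    tau = T - t                                   # time remaining
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def pde_residual(S, t, h=1e-3, r=0.05, sigma=0.2):
    """Finite-difference estimate of V_t + (1/2) sigma^2 S^2 V_SS + r S V_S - r V."""
    V = call_value
    V_t = (V(S, t + h) - V(S, t - h)) / (2 * h)
    V_S = (V(S + h, t) - V(S - h, t)) / (2 * h)
    V_SS = (V(S + h, t) - 2 * V(S, t) + V(S - h, t)) / (h * h)
    return V_t + 0.5 * sigma ** 2 * S ** 2 * V_SS + r * S * V_S - r * V(S, t)

if __name__ == "__main__":
    print("V(100, 0) = %.4f" % call_value(100.0, 0.0))
    print("residual of (*) at (100, 0): %.1e" % pde_residual(100.0, 0.0))
```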

Tuesday, September 8, 2009

Krugman on Mathematics and the Failure of Economics

Probably anyone who is interested in this article has already seen it, but Paul Krugman put out an article in Sunday's New York Times Magazine called "How Did Economists Get It So Wrong?". The article is very well-written, but a bit unsatisfying, as it combines Krugman's more standard worries about macroeconomics with a short attack on financial economics. I am trying to write something right now about the ways in which mathematics can lead scientists astray, and one of my case studies is the celebrated Black-Scholes model for option pricing. Hopefully I can post more on that soon, but here is what Krugman says about it and similar models which are used to price financial derivatives and devise hedging strategies.

My favorite part is where Krugman says "the economics profession went astray because economists, as a group, mistook beauty, clad in impressive-looking mathematics, for truth". But he never really follows this up with much discussion of the mathematics or why it might have proven so seductive. Section III attacks "Panglossian Finance", but this is presented as if it assumes "The price of a company's stock, for example, always accurately reflects the company's value given the information available". But, at least as I understand it, this is not the "efficient market hypothesis" which underlies models like Black-Scholes. Instead, this hypothesis makes the much weaker assumption that "successive price changes may be considered as uncorrelated random variables" (Almgren 2002, p. 1). This is the view that prices over time amount to a "random walk". It has serious problems as well, but I wish Krugman had spent an extra paragraph attacking his real target.

Almgren, R. (2002). Financial derivatives and partial differential equations. American Mathematical Monthly 109: 1-12.

Friday, August 14, 2009

Computer Simulations Support Some New Mathematical Theorems

The current issue of Nature contains an exciting case of the productive interaction of mathematics and physics. As Cohn summarizes here, Torquato and Jiao use computer simulations and theoretical arguments to determine the densest way to pack different sorts of polyhedra together in three-dimensional space:
To find their packings, Torquato and Jiao use a powerful simulation technique. Starting with an initial guess at a dense packing, they gradually modify it in an attempt to increase its density. In addition to trying to rotate or move individual particles, they also perform random collective particle motions by means of deformation and compression or expansion of the lattice's fundamental cell. With time, the simulation becomes increasingly biased towards compression rather than expansion. Allowing the possibility of expansion means that the particles are initially given considerable freedom to explore different possible arrangements, but are eventually squeezed together into a dense packing.
A central kind of case considered is the densest packings of the Platonic solids. These are the five polyhedra formed using only regular polygons of a single sort, where the same number of polygons meet at each vertex: the tetrahedron, icosahedron and octahedron (all using triangles), the cube (using squares) and the dodecahedron (using pentagons). Setting aside the trivial case of the cube, Torquato and Jiao argue that the densest packings for the icosahedron, octahedron and dodecahedron all share a similar feature: they result from a simple lattice structure known as a Bravais lattice. Again, using Cohn's summary:
In such arrangements, all the particles are perfectly aligned with each other, and the packing is made up of lattice cells that each contain only one particle. The densest Bravais lattice packings had been determined previously, but it had seemed implausible that they were truly the densest packings, as Torquato and Jiao's simulations and theoretical analysis now suggest.
The outlier here is the tetrahedron, where the densest packing remains unknown.

Needless to say, there are many intriguing philosophical questions raised by this argument and its prominent placement in a leading scientific journal. To start, how do these arguments using computer simulations compare to other sorts of computer-assisted proofs, such as the four color theorem or the more recent Kepler Conjecture? More to the point, does the physical application of these results have any bearing on the acceptability of using computer simulations in this way?

Saturday, July 25, 2009

The Honeycomb Conjecture (Cont.)

Following up my earlier post, and in line with Kenny’s perceptive comment, I wanted to raise two sorts of objections to the explanatory power of the Honeycomb Conjecture. I call them the problem of weaker alternatives and the bad company problem (in line with similar objections to neo-Fregeanism).

(i) Weaker alternatives: When a mathematical result is used to explain, there will often be a weaker mathematical result that seems to explain just as well. Often this weaker result will only contribute to the explanation if the non-mathematical assumptions are adjusted as well, but it is hard to know what is wrong with this. If this weaker alternative can be articulated, then it complicates the claim that a given mathematical explanation is the best explanation.

This is not just a vague possibility for the Honeycomb Conjecture case. As Hales relates:
It was known to the Pythagoreans that only three regular polygons tile the plane: the triangle, the square, and the hexagon. Pappus states that if the same quantity of material is used for the constructions of these figures, it is the hexagon that will be able to hold more honey (Hales 2000, 448).
This suggests the following explanation of the hexagonal structure of the honeycomb:
(1) Biological constraints require that the bees tile their honeycomb with regular polygons without leaving gaps so that a given area is covered using the least perimeter.

(2) Pappus’ theorem: Any partition of the plane into regions of equal area using regular polygons has perimeter at least that of the regular hexagonal honeycomb tiling.
This theorem is much easier to prove and was known for a long time.
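Pappus' comparison is also easy to check directly. The little Python sketch below scales each of the three regular polygons that tile the plane to unit area and compares the perimeter each tiling uses per unit area, counting each edge once since it is shared by two adjacent cells; the hexagon comes out lowest.

```python
# A quick check of Pappus' comparison: among the three regular polygons that
# tile the plane, the hexagon uses the least perimeter per unit of area
# enclosed, once each edge is counted only once (it is shared by two cells).
import math

def perimeter_per_unit_area(n):
    """Half-perimeter of a regular n-gon scaled to have unit area."""
    # area of a regular n-gon with side s is n * s^2 / (4 * tan(pi / n))
    s = math.sqrt(4.0 * math.tan(math.pi / n) / n)   # side length for area 1
    return n * s / 2.0                               # each edge shared by two tiles

for name, n in [("triangle", 3), ("square", 4), ("hexagon", 6)]:
    print("%-8s %.4f" % (name, perimeter_per_unit_area(n)))
```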

If this is a genuine problem, then it suggests an even weaker alternative which arguably deprives the explanation of its mathematical content:
(1) Biological constraints require that the bees tile their honeycomb with regular polygons without leaving gaps so that a given area is covered using the least perimeter.

(2’) Any honeycomb built using regular polygons has perimeter at least that of the regular hexagonal honeycomb tiling.
We could imagine supporting this claim using experiments with bees and careful measurements.

(ii) Bad company: If we accept the explanatory power of the Honeycomb Conjecture despite our uncertainty about its truth, then we should also accept the following explanation of the three-dimensional structure of the honeycomb. The honeycomb is built on the two-dimensional hexagonal pattern by placing the polyhedron given on the left of the picture both above and below the hexagon. The resulting polyhedron is called a rhombic dodecahedron.



So it seems like we can explain this by a parallel argument to the explanation of the two-dimensional case:
(1*) Biological constraints require that the bees build their honeycomb with polyhedra without leaving gaps so that a given volume is covered using the least surface area.

(2*) Claim: Any partition of a three-dimensional volume into regions of equal volume using polyhedra has surface area at least that of the rhombic dodecahedron pattern.
The problem is that claim (2*) is false. Hales points out that Toth showed that the figure on the right above is a counterexample, although “The most economical form has never been determined” (Hales 2000, 447).

This poses a serious problem to anyone who thinks that the explanatory power of the Honeycomb Conjecture is evidence for its truth. For in the closely analogous three-dimensional case, (2*) plays the same role, and yet is false.

My tentative conclusion is that both problems show that the bar should be set quite high before we either accept the explanatory power of a particular mathematical theorem or take this explanatory power to be evidence for its mathematical truth.

Thursday, July 23, 2009

What Follows From the Explanatory Power of the Honeycomb Conjecture?

Following up the intense discussion of an earlier post on Colyvan and mathematical explanation, I would like to discuss in more detail another example that has cropped up in two recent papers (Lyon and Colyvan 2008, Baker 2009). This is the Honeycomb Conjecture:
Any partition of the plane into regions of equal area has perimeter at least that of the regular hexagonal honeycomb tiling (Hales 2000, 449).
The tiling in question is just (Hales 2001, 1)



The Honeycomb Conjecture can be used to explain the way in which bees construct the honeycombs that they use to store honey. The basic idea of this explanation is that the bees which waste the minimum amount of material on the perimeters of the cells which cover a maximum surface area will be favored by natural selection. As Lyon and Colyvan put it:
Start with the question of why hive-bee honeycomb has a hexagonal structure. What needs explaining here is why the honeycomb is always divided up into hexagons and not some other polygon (such as triangles or squares), or any combination of different (concave or convex) polygons. Biologists assume that hivebees minimise the amount of wax they use to build their combs, since there is an evolutionary advantage in doing so. ... the biological part of the explanation is that those bees which minimise the amount of wax they use to build their combs tend to be selected over bees that waste energy by building combs with excessive amounts of wax. The mathematical part of the explanation then comes from what is known as the honeycomb conjecture: a hexagonal grid represents the best way to divide a surface into regions of equal area with the least total perimeter. … So the honeycomb conjecture (now the honeycomb theorem), coupled with the evolutionary part of the explanation, explains why the hive-bee divides the honeycomb up into hexagons rather than some other shape, and it is arguably our best explanation for this phenomenon (Lyon and Colyvan 2008, 228-229).
Lyon and Colyvan do not offer an account of how this conjecture explains, but we can see its explanatory power as deriving from its ability to link the biological goal of minimizing the use of wax with the mathematical feature of tiling a given surface area. It is thus very similar to Baker's periodic cicada case where the biological goal of minimizing encounters with predators and competing species is linked to the mathematical feature of being prime.

Baker uses the example to undermine Steiner’s account of mathematical explanation. For Steiner, a mathematical explanation of a physical phenomenon must become a mathematical explanation of a mathematical theorem when the physical interpretation is removed. But Baker notes that the Honeycomb Conjecture wasn’t proven until 1999, and this failed to undermine the explanation of the structure of the bees’ hive (Baker 2009, 14).

So far, so good. But there are two interpretations of this case, only one of which fits with the use of this case in the service of an explanatory indispensability argument for mathematical platonism.
Scenario A: the biologists believe that the Honeycomb Conjecture is true and this is why it can appear as part of a biological explanation.
Scenario B: the biologists are uncertain if the Honeycomb Conjecture is true, but they nevertheless deploy it as part of a biological explanation.
It seems to me that advocates of explanatory indispensability arguments must settle on Scenario B. To see why, suppose that Scenario A is true. Then the truth of the Conjecture is presupposed when we give the explanation, and so the explanation cannot give us a reason to believe that the Conjecture is true. A related point concerns the evidence that the existence of the explanation is supposed to confer on the Conjecture according to Scenario B. Does anybody really think that the place of this conjecture in this explanation gave biologists or mathematicians a new reason to believe that the Conjecture is true? The worry seems even more pressing if we put the issue in terms of the existence of entities: who would conclude from the existence of this explanation that hexagons exist?

Hales, T. C. (2000). "Cannonballs and Honeycombs." Notices Amer. Math. Soc. 47: 440-449.

Hales, T. C. (2001). "The Honeycomb Conjecture." Disc. Comp. Geom. 25: 1-22.

Thursday, July 16, 2009

El Niño Has Arrived. But What is El Niño?

According to Nature the latest El Niño has begun in the Pacific. I got interested in this meteorological phenomenon back when I was living in California and coincidentally read Mike Davis' polemic Late Victorian Holocausts: El Niño Famines and the Making of the Third World. While a bit over the top, it contains a great section on the history of large-scale meteorology, including the discovery of El Niño. As I discuss in this article, El Niño is a multi-year cyclical phenomenon over the Pacific that affects sea-surface temperature and pressure from India to Argentina. What I think is so interesting about it from a philosophy of science perspective is that scientists can predict its evolution once a given cycle has formed, but a detailed causal understanding of what triggers a cycle or what ends it remains a subject of intense debate. See, for example, this page for an introduction to the science and here for a 2002 article by Kessler which asks if El Niño is even a cycle. This is yet one more case where causal ignorance is overcome by sophisticated science and mathematics.

Tuesday, December 16, 2008

The Limits of Causal Explanation

Woodward's interventionist conception of causal explanation is perhaps the most expansive and well-worked out view on the market. He conceives of a causal explanation as providing information about how the explanandum would vary under appropriate possible manipulations. Among other things, this allows an explanatory role to phenomenological laws or other generalizations that support the right kind of counterfactuals, even if they do not invoke any kind of fundamental or continuous causal process.

Given the recent debates on mathematical explanation of physical phenomena, it's worth wondering if Woodward's account extends to these cases as well. In a short section in the middle of the book, he concedes that not all explanations are causal in his sense:
it has been argued that the stability of planetary orbits depends (mathematically) on the dimensionality of the space-time in which they are situated: such orbits are stable in four-dimensional space-time but would be unstable in a five-dimensional space-time ... it seems implausible to interpret such derivations as telling us what will happen under interventions on the dimensionality of space-time (p. 220).
More generally, when it is unclear how to think of the relevant feature of the explanandum as a variable, Woodward rejects the explanation as causal.

Still, some mathematical explanations will qualify as causal. This seems to be the case for Lyon and Colyvan's phase space example, but perhaps not for the Konigsberg bridges case I have sometimes appealed to. To see the problem for the bridge case, recall that the crucial theorem is
A connected graph G is Eulerian iff every vertex of G has even valence.
As the bridges form a graph like the figure, they are non-Eulerian, i.e. no circuit crosses each edge exactly once.



I would argue, though, that as with the space-time example, there is no sense in which a possible intervention would alter the bridges so that they were Eulerian. We could of course destroy some bridges, but this would be a change from one bridge system to another bridge system. It seems that to support this position, there must be a clear set of essential properties of the bridge system that are not rightly conceived as variable.
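For concreteness, the parity condition in the theorem above is trivial to check for the Konigsberg multigraph (the standard encoding with four land masses and seven bridges; the graph is connected, so the theorem applies):

```python
# Checking the parity condition for the Konigsberg multigraph: four land
# masses A, B, C, D and seven bridges (the graph is connected, so the
# theorem applies).
from collections import Counter

bridges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

valence = Counter()
for u, v in bridges:
    valence[u] += 1
    valence[v] += 1

print(dict(valence))                                  # {'A': 5, 'B': 3, 'C': 3, 'D': 3}
print("Eulerian:", all(d % 2 == 0 for d in valence.values()))   # False
```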

Friday, November 14, 2008

Post-blogging the PSA: Gauge Freedom and Drift


It's taken me a few days to recover from the excellent PSA. I talked to many people who had a great time and who thought this year's program was exceptionally well-balanced to reflect both old classics and new debates in philosophy of science.

On the first day I was happy to attend two sessions which reflect the interpretative difficulties arising from the central role of some mathematics. In the first session, Richard Healey summarized his paper "Perfect Symmetries", followed by Hilary Greaves' and David Wallace's attempts to critically reconstruct Healey's central argument. Very roughly, Healey aims to distinguish cases where a symmetry in the models of a theory explains observed empirical symmetries in physical systems from cases where there are theoretical symmetries with no analogous explanatory power. In the latter case, the theoretical symmetries may just amount to 'mathematical fluff' or 'surplus structure' that lack physical significance.

Then it was time for some biology and the symposium "(Mis)representing Mathematical Models in Biology". The session began with biologist Joan Roughgarden's summary of different kinds of models in biology, followed by Griesemer, Bouchard and Millstein talking about different issues in their interpretation. Both Griesemer and Millstein emphasized the importance of a biologically grounded understanding of the components of a biological model, and argued that a merely mathematical definition of such components would block our understanding of biological systems. Millstein was especially emphatic (to quote from a handout from a previous presentation of hers): "Selection and drift are physical, biological phenomena; neither is a mathematical construct." That is, when we look at the changes in some biological system over time, we cannot think of them as resulting from a genuine process of selection plus some merely mathematically represented divergence from an ideal that we label "drift". Instead, drift itself must be countenanced as a genuine process that makes its own positive contribution to what we observe in nature.

While it is a bit of a stretch, there is at least a suggestive analogy between these debates in physics and biology: in both cases, we have a useful and perhaps indispensable mathematically identified feature of our theories whose physical and biological status can remain in doubt, even for our best, current scientific theories. Here, it seems to me, we see some of the costs of deploying mathematics.

Sunday, October 26, 2008

Downward Causation in Fluids?

Bishop claims to have found a case of downward causation in physics based on the existence of what is known as Rayleigh-Benard convection in fluids. In the simplest case we have a fluid like water that is heated from below. What can result, as this image from Wikipedia shows, is a series of cells, known as Benard cells, where the dominant large-scale structure is fluid flowing in interlocking circular patterns.

The claim is that these patterns require new causal powers over and above what can be ascribed to the smaller scale fluid elements: "although the fluid elements are necessary to the existence and dynamics of Benard cells, they are not sufficient to determine the dynamics, nor are they sufficient to fully determine their own motions. Rather, the large-scale structure supplies a governing influence constraining the local dynamics of the fluid elements" (p. 239).

There is no doubt that this is an interesting case that should receive more scrutiny. As with McGivern's article, the tricky interpretative question is how closely we should link the workings of the mathematical model to the genuine causes operating in the system. Bishop's conclusion seems based on taking the representation of fluid elements very seriously, but I am not sure that the link between the representation and reality at this level is well enough understood. Still, I would concede his point that many features of downward causation from philosophical accounts appear in this example.

Sunday, October 12, 2008

McGivern on Multiscale Structure

Back in July I made a brief post on multiscale modeling from the perspective of recent debates on modeling and representation. So I was very happy to come across a recent excellent article by McGivern on “Reductive levels and multi-scale structure”. McGivern gives a very accessible summary of a successful representation of a system involving two time scales, and then goes on to use this to question some of the central steps in Kim’s influential argument against nonreductive physicalism.

To appreciate the central worry, we need the basics of his example. McGivern discusses the case of a damped harmonic oscillator, like a spring suspended in a fluid, where the damping is given as a constant factor of the velocity. So, instead of the simple linear harmonic oscillator
my'' + ky = 0
we have
my'' + cy' + ky = 0
Now this sort of system can be solved exactly, so a multiscale analysis is not required. Still, it is required in other cases, and McGivern shows how it can lead to not only accurate representations of the evolution of the system but also genuine explanatory insight into its features. In this case, we think of the spring evolving according to two time scales, t_s and t_f, where t_f = t and t_s = εt and ε is small. Mathematical operations on the original equation then lead to
y(t) ~ exp(-t_s/2)cos(t_f)
where ~ indicates that this representation of y is an approximation (essentially because we have dropped terms that are higher-order in ε). McGivern then plots the results of this multiscale analysis against the exact analysis and shows how closely they agree.
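Here is a sketch of that comparison for the nondimensionalized equation y'' + εy' + y = 0 with y(0) = 1 and y'(0) = 0; the scaling and initial conditions are my own choices, not necessarily McGivern's. For small ε the exact solution and the leading-order two-scale approximation stay close over many oscillations.

```python
# Comparing the exact solution of y'' + eps*y' + y = 0 (y(0) = 1, y'(0) = 0)
# with the leading-order two-scale approximation exp(-t_s/2) * cos(t_f),
# where t_s = eps*t and t_f = t.  The nondimensionalization and initial
# conditions are my own choices.
import math

eps = 0.1
omega = math.sqrt(1.0 - eps ** 2 / 4.0)   # exact damped frequency

def exact(t):
    return math.exp(-eps * t / 2.0) * (math.cos(omega * t)
                                       + (eps / (2.0 * omega)) * math.sin(omega * t))

def two_scale(t):
    t_s, t_f = eps * t, t
    return math.exp(-t_s / 2.0) * math.cos(t_f)

worst = max(abs(exact(t) - two_scale(t)) for t in (0.01 * k for k in range(5000)))
print("largest discrepancy on [0, 50]: %.4f" % worst)
```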

McGivern’s argument, then, is that the t_s and the t_f components represent distinct multiscale structural properties of the oscillator, but that they are not readily identified with the “micro-based properties” championed by Kim. McGivern goes on to consider the reply that these are not genuine properties of the system, but merely products of mathematical manipulation. This seems to me to be the most serious challenge to his argument, but the important point is that we need to work through the details to see how to interpret the mathematics here. I would expect that different applications of multiscale methods would result in different implications for our metaphysics. I hope that this paper will be studied not only by the philosophy of mind community, but also by people working on modeling. If we can move both debates closer to actual scientific practice, then surely that will be a good thing!

Saturday, September 20, 2008

Traffic and Shock Waves

As explained in elementary terms here, traffic can be modeled using a density function ρ(x, t) and a flux function j(ρ(x, t)); i.e., we assume that the rate at which cars pass a given point is a function of the density of cars at that point. Making certain continuity assumptions, we can obtain a conservation law

ρ_t + j’(ρ)ρ_x = 0

where subscripts indicate partial differentiation and j' indicates differentiation with respect to ρ. If we take j(ρ) = 4ρ(2-ρ) and start with a discontinuous initial density distribution like

ρ(x, 0) = 1 if x <= 1
ρ(x, 0) = 1/2 if 1 < x <= 3
ρ(x, 0) = 2/3 if x > 3

then we can show how the discontinuity persists and changes over time.
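Here is a minimal numerical sketch of that evolution, using the Lax-Friedrichs scheme; the grid, time step and domain are my own choices, and a sturdier treatment would use a Godunov-type method. The largest cell-to-cell jump in density tracks the discontinuity that starts at x = 3 as it moves to the right.

```python
# A Lax-Friedrichs sketch of rho_t + j(rho)_x = 0 with j(rho) = 4*rho*(2 - rho)
# and the initial data above.  Grid, time step and domain are my own choices.
def j(rho):
    return 4.0 * rho * (2.0 - rho)

def initial(x):
    if x <= 1.0:
        return 1.0
    if x <= 3.0:
        return 0.5
    return 2.0 / 3.0

nx, dx = 400, 0.05                      # domain [0, 20]
dt = 0.005                              # CFL is satisfied since |j'(rho)| <= 8
rho = [initial((i + 0.5) * dx) for i in range(nx)]

for step in range(400):                 # evolve to t = 2
    new = rho[:]                        # endpoints held fixed (far from the action)
    for i in range(1, nx - 1):
        new[i] = 0.5 * (rho[i - 1] + rho[i + 1]) \
                 - dt / (2.0 * dx) * (j(rho[i + 1]) - j(rho[i - 1]))
    rho = new

# locate the largest jump between neighbouring cells -- the moving discontinuity
size, where = max((abs(rho[i + 1] - rho[i]), (i + 1) * dx) for i in range(nx - 1))
print("largest density jump %.3f near x = %.2f at t = %.1f" % (size, where, 400 * dt))
```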

Such persisting discontinuities are called shock waves and appear as lines across which the density changes discontinuously. For example, in this figure we have lines of constant density intersecting at x = 3. The philosophical question is "what are we to make of this representation of a given traffic system?" That is, what does the system have to be like for the representation of a shock wave to be correct? My suggestion is that we need only a thin strip around x = 3 where the density changes very quickly, i.e. so quickly that a driver crossing it would have to decelerate to zero speed. Then, on the other side of the strip, the driver experiences a dramatic drop off in density, and so can accelerate again. Still, there is something a bit strange in talking about shock waves in traffic cases where the number of objects involved is so small, as opposed to fluid cases where many more fluid particles interact across a shock wave. Here, then, I would suggest that we have a case where the mathematics works, but we are less than sure what it is representing in the world.

See this New Scientist article (and amusing video) for the claim that shock waves can be observed in actual traffic experiments (summarizing this 2008 article).

Wednesday, August 27, 2008

Explaining Clumps via Transient Simulations

Following up the previous post on Batterman and mathematical explanation, here is a case where a mathematical explanation has been offered of a physical phenomenon and the explanation depends on not taking asymptotic limits. The phenomenon in question is the "clumping" of species around a given ecological niche. This is widely observed, but it conflicts with equilibrium analyses of the relevant ecological models, which instead predict that a single species will occupy a single niche.

As Nee & Colegrave reported in 2006 (Nature 441: 417-418), Scheffer & van Nes (DOI: 10.1073/pnas.0508024103) overcame this unsatisfactory state of affairs by running simulations that examine the long-term, but still transient, behavior of the same ecological models. This successfully reproduced the clumping observed in ecological systems:
Analytical work looks at the long-term equilibria of models, whereas a simulation study allows the system to be observed as it moves towards these equilibria ... The clumps they observe are transient, and each will ultimately be thinned out to a single species. But 'ultimately' can be a very long time indeed: we now know that transient phenomena can be very long-lasting, and hence, important in ecology, and such phenomena can be studied effectively only by simulation (417).
While the distinction between analysis and simulation seems to me to be a bit exaggerated, the basic point remains: we can sometimes explain natural phenomena using mathematics only by not taking limits. Limits can lead to misrepresentations just as much as any other mathematical technique. More to the point, explanatory power can arise from examining the non-limiting behavior of the system.

Friday, August 22, 2008

Batterman on "The Explanatory Role of Mathematics in Empirical Science"

Batterman has posted a draft tackling the problem of how mathematical explanations can provide insight into physical situations. Building on his earlier work, he emphasizes cases of asymptotic explanation where a mathematical equation is transformed by taking limits of one or more quantities, e.g. to 0 or to infinity. A case that has received much discussion (see the comment by Callender in SHPMP) is the use of the “thermodynamic limit” of infinitely many particles in accounting for phase transitions. In this paper Batterman argues that “mapping” accounts of how mathematics is applied, presented by me as well as (in a different way) Bueno & Colyvan, are unable to account for the explanatory contributions that mathematics makes in this sort of case.

I would like to draw attention to two claims. First, “most idealizations in applied mathematics can and should be understood as the result of taking mathematical limits” (p. 9). Second, the explanatory power of these idealizations is not amenable to treatment by mapping accounts because the limits involve singularities: “Nontraditional idealizations [i.e. those ignored by traditional accounts] cannot provide such a promissory background because the limits involved are singular” (p. 20). Batterman has made a good start in this paper arguing for the first claim. The argument starts from the idea that we want to explain regular and recurring phenomena. But if this is our goal, then we need to represent these phenomena in terms of what their various instantiations have in common. And it is a short step from this to the conclusion that what we are doing is representing the phenomena so that they are stable under a wide variety of perturbations of irrelevant detail. We can understand the technique of taking mathematical limits, then, as a fancy way of arriving at a representation of what we are interested in.

Still, I have yet to see any account of why we should expect the limits to involve singularities. Of course, Batterman’s examples do involve singularities, but why think that this is the normal situation? As Batterman himself explains, “A singular limit is one in which the behavior as one approaches the limit is qualitatively different from the behavior one would have at the limit”. For example, with the parameter “e”, the equation ex^2 - 2x - 2 = 0 has two roots for e ≠ 0, and one root for e = 0. So, the limit as e goes to 0 is singular. But the equation x^2 - 2ex - 2 = 0 has a regular limit as e goes to 0, as the number of roots remains the same. So, the question remains: why would we expect the equations that appear in our explanations to result from singular, and not regular, limits?
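To make the contrast vivid, here is a quick numerical look at the two equations as e shrinks: in the singular case one root runs off to infinity, while in the regular case both roots settle down.

```python
# Root counting for the two equations as the parameter e goes to 0.  The
# sample values of e are arbitrary; the point is that the singular limit
# loses a root (it runs off to infinity) while the regular limit does not.
import math

def quadratic_roots(a, b, c):
    disc = math.sqrt(b * b - 4 * a * c)
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

for e in (0.1, 0.01, 0.001):
    singular = quadratic_roots(e, -2.0, -2.0)        # e*x^2 - 2x - 2 = 0
    regular = quadratic_roots(1.0, -2.0 * e, -2.0)   # x^2 - 2ex - 2 = 0
    print("e = %5.3f   singular roots: %9.3f, %7.3f   regular roots: %6.3f, %6.3f"
          % (e, *singular, *regular))
# At e = 0 the first equation is linear, with the single root x = -1, while
# the second still has the two roots x = +sqrt(2) and x = -sqrt(2).
```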

Batterman makes a start on an answer to this as well, but as he (I think) recognizes, it remains incomplete. His idea seems to be that singular limits lead to changes in the qualitative behavior of the system and that in many/most cases our explanation is geared at this qualitative change. Still, just because singular limits are sufficient for qualitative change it does not follow that all or even most explanations of qualitative change will involve singular limits. Nevertheless, here is an important perspective on stability analysis that I hope he will continue to work out.

Friday, August 15, 2008

Lyon & Colyvan on Phase Spaces

In their recent article “The Explanatory Power of Phase Spaces” Aidan Lyon and Mark Colyvan develop one of Malament’s early criticisms of Field’s program to provide nominalistic versions of our best scientific theories. Malament had pointed out that it was hard to see how Field’s appeal to space-time regions would help to nominalize applications of mathematics involving phase spaces. As only one point in a given phase space could be identified with the actual state of the system, some sort of modal element enters into phase space representations such as Hamiltonian dynamics where we consider non-actual paths. Lyon and Colyvan extend this point further by showing how the phase space representation allows explanations that are otherwise unavailable. They focus on the twin claims that
All galactic systems that can be modeled by the Henon-Heiles system with low energies tend to exhibit regular and predictable motion;
All galactic systems that can be modeled by the Henon-Heiles system with high energies tend to exhibit chaotic and unpredictable motion.
The mathematical explanation of these claims involves an analysis of the structure of the phase spaces of a Henon-Heiles system via Poincare maps. As the energy of such a system is increased, the structure changes and the system can be seen to become more chaotic.
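For the curious, here is a rough sketch of the kind of Poincare-section computation involved; the section x = 0 (crossed with positive momentum) and the two sample energies are my own choices for illustration, not anything taken from Lyon and Colyvan's paper. Plotting the recorded (y, p_y) points shows smooth curves at the lower energy and increasingly scattered points at the higher one.

```python
# A rough sketch of a Poincare-section computation for the Henon-Heiles
# system.  The Hamiltonian is the standard H = (px^2 + py^2)/2 + (x^2 + y^2)/2
# + x^2*y - y^3/3; the section x = 0 (crossed with px > 0) and the two sample
# energies are my own choices for illustration.
import math

def accel(x, y):
    """Henon-Heiles equations of motion: dpx/dt and dpy/dt."""
    return -x - 2.0 * x * y, -y - x * x + y * y

def rk4_step(state, dt):
    """One fourth-order Runge-Kutta step for the state (x, y, px, py)."""
    def deriv(s):
        x, y, px, py = s
        ax, ay = accel(x, y)
        return (px, py, ax, ay)
    k1 = deriv(state)
    k2 = deriv([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = deriv([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = deriv([s + dt * k for s, k in zip(state, k3)])
    return [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def poincare_section(energy, y0=0.1, py0=0.0, steps=50000, dt=0.01):
    """Collect (y, py) whenever the orbit crosses x = 0 with px > 0."""
    px0_sq = 2.0 * energy - py0 ** 2 - y0 ** 2 + (2.0 / 3.0) * y0 ** 3
    if px0_sq <= 0.0:
        raise ValueError("initial condition is not on this energy surface")
    state = [0.0, y0, math.sqrt(px0_sq), py0]
    points = []
    for _ in range(steps):
        prev, state = state, rk4_step(state, dt)
        if prev[0] < 0.0 <= state[0] and state[2] > 0.0:
            f = -prev[0] / (state[0] - prev[0])          # interpolate to x = 0
            points.append((prev[1] + f * (state[1] - prev[1]),
                           prev[3] + f * (state[3] - prev[3])))
    return points

if __name__ == "__main__":
    for E in (1.0 / 12.0, 1.0 / 8.0):                    # lower vs. higher energy
        pts = poincare_section(E)
        print("E = %.4f: %d section points, e.g. %s" % (E, len(pts), pts[0]))
```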

For me, the central philosophical innovation of the paper is the focus on explanatory power, and the claim that even if a nominalistic theory can represent the phenomena in question, the nominalistic theory lacks the explanatory power of the mathematical theory. This is an intriguing claim which seems to me to be largely correct. Still, one would want to know what the source of the explanatory power really is. Lyon and Colyvan focus on the modal aspects of the representation, and claim that this is what would be missing from a nominalistic theory. But it seems that a Field-style theory would have similar problems handling cases of stability analysis where phase spaces are absent. For example, I have used the example of the Konigsberg bridges where the topology of the bridges renders certain sorts of paths impossible. There is of course a modal element in talking of impossible paths, but the non-actual paths are not part of the representation in the way that they appear in phase spaces. What the bridges have in common with this case is that a mathematical concept groups together an otherwise disparate collection of physical phenomena. While all these phenomena may be represented nominalistically, there is something missing from this highly disjunctive representation. I am not sure if what is lost is best characterized as explanatory power, but something is surely worse.

Three different elements come together, then, in Lyon and Colyvan’s case, and it is not clear which contribute to explanatory power: (i) non-actual trajectories in a phase space, (ii) a mathematical concept that groups together a variety of physical systems (“the Henon-Heiles system”) and (iii) stability analysis. Maybe they all make a contribution, but more examples are needed to see this.

Monday, July 28, 2008

Science and the A Priori

With some trepidation I have posted a draft of my paper "A Priori Contributions to Scientific Knowledge". The basic claim is that two kinds of a priori entitlement are needed to ground scientific knowledge. I find one kind, called "formal", in the conditions on concept possession, and so here I largely follow Peacocke. For the other kind, called "material", I draw on Friedman's work on the relative a priori.

Even those not interested in the a priori might gain something from the brief case study of the Clowe et al. paper "A Direct Empirical Proof of the Existence of Dark Matter". What is intriguing to me about this case is that the "proof" works for a wide variety of "constitutive frameworks" or "scientific paradigms" in addition to the general theory of relativity. I would suggest that this undermines the claim that such frameworks are responsible for the meaning of the empirical claims, such as "Dark matter exists", but I would still grant them a role in the confirmation of the claims.

Update (July 10, 2009): I have removed this paper for substantial revisions.

Wednesday, July 23, 2008

Multiscale Modeling

Most discussions of modeling in the philosophy of science consider the relationship between a single model, picked out by a set of equations, and a physical system. While this is appropriate for many purposes, there are also modeling contexts in which the challenge is to relate several different kinds of models to a system. One such case can be grouped under the heading of 'multiscale modeling'. Multiscale modeling involves considering two or more models that represent a system at different scales. An intuitive example is a model that represents basic particle-to-particle interactions and a continuum model involving larger scale variables for things like pressure and temperature.

In my own work on multiscale modeling I had always assumed that the larger-scale, macro models would be more accessible, and that the challenge lay in seeing how successful macro models relate to underlying micro models. From this perspective, Batterman's work shows how certain macro models of a system can be vindicated without the need of a micro model for that system.

It seems, though, that applied mathematicians have also developed techniques for working exclusively with a micro model due to the intractable nature of some macro modeling tasks. The recently posted article by E and Vanden-Eijnden, "Some Critical Issues for the 'Equation-free' Approach to Multiscale Modeling", challenges one such technique. As developed by Kevrekidis, Gear, Hummer and others, the equation-free approach aims to model the macro evolution of a system using only the micro model and its equations:
We assume that we do not know how to write simple model equations at the right macroscopic scale for their collective, coarse grained behavior. We will argue that, in many cases, the derivation of macroscopic equations can be circumvented: by using short bursts of appropriately initialized microscopic simulation, one can effectively solve the macroscopic equations without ever writing them down, and build a direct bridge between microscopic simulation and traditional continuum numerical analysis. It is, thus, possible to enable microscopic simulators to directly perform macroscopic systems level tasks (1347).
At an intuitive level, the techniques involve using a sample of microscopic calculations to estimate the development of the system at the macroscopic level. E and Vanden-Eijnden question both the novelty of this approach and its application to simple sorts of problems. One challenge is that the restriction to the micro level may not be any more tractable than a brute force numerical solution to the original macro level problem.
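To get a feel for the basic idea, here is a toy coarse projective-integration scheme of my own construction, far simpler than anything in the papers just mentioned: the micro model is an ensemble of drifting random walkers, the macro observable is their mean position, and the scheme estimates the macro time derivative from short micro bursts and then takes large projective jumps without ever writing down the macro equation.

```python
# A toy coarse projective-integration scheme in the spirit of the
# "equation-free" approach (my own construction).  Micro model: an ensemble
# of drifting random walkers.  Macro observable: their mean position M, whose
# unwritten macro equation is dM/dt = drift.
import random

random.seed(0)
N, drift, noise, dt_micro = 1000, 0.3, 1.0, 0.01

def lift(M):
    """Build a micro state (walker positions) consistent with macro state M."""
    return [M + random.gauss(0.0, 0.1) for _ in range(N)]

def micro_burst(walkers, steps):
    """Run a short burst of micro dynamics, recording the mean after each step."""
    means = []
    for _ in range(steps):
        walkers = [x + drift * dt_micro
                   + noise * (dt_micro ** 0.5) * random.gauss(0.0, 1.0)
                   for x in walkers]
        means.append(sum(walkers) / N)
    return means

M, t, DT, burst = 0.0, 0.0, 0.5, 20
while t < 5.0:
    means = micro_burst(lift(M), burst)
    slope = (means[-1] - means[0]) / ((burst - 1) * dt_micro)  # estimated dM/dt
    M = means[-1] + slope * DT                                 # projective jump
    t += burst * dt_micro + DT
print("equation-free estimate M(t=%.1f) = %.3f; exact drift*t = %.3f" % (t, M, drift * t))
```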