Notes from tonight's meetup on libraries, media and the semantic web, hosted by the BBC

Loose notes from the meetup at the BBC about #semanticweb, London, March 28, 2012

Media powerhouses, libraries, archives and museums are agreeing on standards to link data. Some of their representatives spoke at the meetup in London. There were more than 100 attendees. Ten per cent of them worked at the host, the BBC.

Going from Tables to Graphs, finding linked data connections and moving to a linked Web of Data @jonvoss @Historypin #londonsemweb #lodlam

Ade Stevenson – Mimas – @adrianstevenson

dirty data, URI persistence

data modelling can be hard

steep learning curve

complexity

how sustainable are the data sources?

Evan Sandhaus – The New York Times – @kansandhaus

Why we need rNews markup

It does not make sense to keep structured data in CMS databases only to lose it in the HTML published on the Web

Benefits of rNews if widely adopted by the publishing community

Superior Algorithmically Generated Links

Superior Tool Support

Better Analytics

Dan Brickley – Schema.org – @danbri

Works for Google on the Schema.org project

Paul Otlet “the man who dreamed the Internet”

Lonclass:

Compositional Semantics

WebSchemas group

Example: Google Rich Snippets

Get the markup on the page

Markup is not pretty, e.g. on IMDb, with entities and relationships in code

Classes (types)

Properties (attributes)

Silver Oliver – BBC – @silveroliver

Andy Wilson – BBC Academy – @andywilson460

Andy is Head of Centre of Technology at BBC Academy

Google Refine to get RDF

Patented genomics

The patent system is being used to monopolise all the tools for genomic manipulation.

In the US, for example, it costs a woman between $3,000 and $4,000 to be tested for familial breast cancer, a test that is free for all women in Europe. This is because a corporation owns, in the US only, the patent on the two genes involved.

 

Flowering bamboos, rat outbreaks and changes of political regime

Mizoram rats eat crops
Source: National Geographic

Every 40 to 50 years, the bamboo Melocanna baccifera blooms in Mizoram, an Indian state between Bangladesh and Myanmar (Burma). The bamboo covers one third of this folded, hilly state. Its rhizomes hold together the soil on the slopes of the hills.

view of Mizoram from the air
Source: Google Earth

The stems, leaves, buds, fruits and seeds of the bamboo are crucial to the livelihood of most of the inhabitants of the state. The ripening of the bamboo's fruit starts an ecological cascade, known locally as Mautam, that has dire consequences for the local population. The last one in the twentieth century ran from 1959 to 1961. That bloom, like the preceding ones in the historical record, spurred a plague of rats feeding off the bamboo's fruit. The population of a stinkbug (a hemipteran known as thangnang in Mizo) is also known to explode in size. The rats eat up the crops before the harvest, plunging the population into misery and famine.

Ecology of the black rat in the bamboo forests

In normal years the black rat represents only 10% of the rats of all species in the bamboo forests. In Mautam years, black rats make up more than 90% of the rodents culled. The success of the black rat is due to its larger litters and shorter pregnancy. In normal years the larger litters are kept in check by mothers eating their newborn pups. Its pregnancies are 5 days shorter than those of other species. This makes the black rat the species that profits most from the abundance of seeds in Mautam years.

The same female rat may bear one litter every month during the Mautam. Most female rats reproduce at the same time. As a consequence, the population grows in pulses. From 50, they grow to 200, then 600, etc. Eventually the bamboo seeds run out and the rats invade the nearby fields. If the crop has not yet been harvested, the rats will eat all the grains of rice and maize.

The gregarious, synchronised flowering of bamboo has a period of approximately 40-50 years, depending on the species. It marks the death of the plants across thousands of square kilometres and the sprouting of the next generation. The flowering is controlled by the genome; if the bamboo is cut before it flowers, it will flower immediately upon regrowth.

This cycle is longer than the emergence of the periodical cicadas in eastern North America every 17 years. Phyllostachys bambusoides flowers every 130 years in China. Which type of parasite or predator might have tuned the biological clock of the bamboo? And why are the rat and one insect the only ones that benefit from this cyclical bloom?

The bamboo flowering changed the regime of Mizoram

The Mautam outbreaks shape the politics of the state too. The growing number of rats captured before 1959 led the Mizo people to warn of the imminent Mautam. These calls were dismissed as folk superstition by the government of Assam, the Indian state of which Mizoram was then a district. The officials failed to prepare for the famine that followed. This led to the foundation of the Mizo National Famine Front to provide famine relief. It later became the Mizo National Front (MNF), which staged a major uprising in 1966. The MNF fought a separatist war against the Indian Army. In 1986 a peace accord granted Mizoram autonomy as a separate state within India.

The influence of ecology on our civilization is still largely unknown at best, and underestimated or overlooked more often than not. Scientists at the Earth Institute at Columbia University studied 175 countries and 234 conflicts, each of which killed more than 25 people in a given year. They found a strong correlation between El Niño events and uprisings and civil wars. From 1950 to 2004 the chance of civil war breaking out was about 3 percent during La Niña; during El Niño, the chance doubled, to 6 percent. Countries not affected by the El Niño-Southern Oscillation remained at 2 percent no matter what. Overall, the scientists calculated that El Niño may have played a role in 21 percent of civil wars worldwide, and nearly 30 percent in those countries affected by El Niño.

How to prevent famine and wars from the Mautam

Areas only 5 to 10 km apart can be affected very differently by the pest, whether because of the different progress of the flowering waves of the bamboos, agricultural practices or other factors. There are around 30 bamboo species growing in the Bay of Bengal region. Their ecology is still largely unknown.

It is still unclear how to resist the rodent floods effectively. The locals slash and burn vegetation to grow vegetables, a practice known as jhum cultivation. The government of Mizoram has launched a policy to end jhum cultivation.

If you want to see some of the action of the last Mautam, watch Rat Attack! The documentary has the typically sensationalist title of all National Geographic productions. Shot in 2009, the video chronicles the research of Ken Applin, a rodent expert, in Mizoram in 2007.

Is the Wisdom of Crowds just a matter of Physics?

I took a few notes while reading the book “The Wisdom of Crowds”, which I post here.

James Surowiecki’s book is about how groups are smarter than the individuals that compose them, under certain conditions that the author sets out. For instance, a crowd can calculate the weight of an ox, or predict the outcome of a general election or of a drug test, more accurately than the best experts, consistently over time.

The central claim is that a group of people produces a superior average judgement regardless of the ability, engagement and information of its best individuals, or of a small subset of them. In other words, averaging the answers of the individuals in a large group yields estimates of counts or other quantities that are more accurate and precise than those of its brightest members.
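To make the averaging idea concrete, here is a minimal sketch in Python. It is not taken from the book: the true weight, the crowd size and the assumption that guesses are independent, unbiased and normally distributed are all illustrative choices of mine. It compares the error of the averaged guess with the errors of the individual guesses.

```python
import random
import statistics


def crowd_vs_individuals(true_value, guesses):
    """Compare the error of the averaged guess with the individual errors."""
    crowd_estimate = statistics.mean(guesses)
    crowd_error = abs(crowd_estimate - true_value)
    individual_errors = [abs(g - true_value) for g in guesses]
    mean_individual_error = statistics.mean(individual_errors)
    # Share of individuals whose own guess is worse than the crowd average.
    share_beaten = sum(e > crowd_error for e in individual_errors) / len(guesses)
    return crowd_estimate, crowd_error, mean_individual_error, share_beaten


# Hypothetical set-up: 800 independent, unbiased guesses of an ox's weight.
TRUE_WEIGHT = 1200.0      # illustrative value, not taken from the book
rng = random.Random(42)   # fixed seed so the run is repeatable
guesses = [rng.gauss(TRUE_WEIGHT, 100.0) for _ in range(800)]

estimate, crowd_error, mean_individual_error, share_beaten = crowd_vs_individuals(
    TRUE_WEIGHT, guesses
)
print(f"crowd estimate:           {estimate:.1f}")
print(f"error of the crowd:       {crowd_error:.1f}")
print(f"average individual error: {mean_individual_error:.1f}")
print(f"individuals beaten:       {share_beaten:.0%}")
```

Whatever the guesses are, the error of the average can never exceed the average individual error; that much is plain arithmetic. How many individuals the crowd beats depends on the independence and lack of bias assumed here, which is why the conditions listed further down matter.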

This wisdom works particularly well for simultaneous, well-defined problems with a limited, pre-determined set of solutions, like polls or contests. The book falls short of proposing new practical applications of the wisdom. Which practical problems can be solved with this newly found Wisdom of Crowds? The author claims that “the implications [of the wisdom of crowds] for the future are immense”. He suggests that security intelligence would benefit from probability and decision markets.

Physics as a Social Science

I read The Wisdom of Crowds after Philip Ball’s book Critical Mass. Critical Mass describes how physics is helping to understand some phenomena of social science. I enjoyed reading both immensely, but I feel that what I learned from Critical Mass influenced the way I feel about The Wisdom of Crowds. In the latter I miss the approach of asking and researching the why, not only the how, of the collective phenomena described in the book. I would also have enjoyed an impulse to cross the science of sociologists and social psychologists with that of mathematicians, ecologists and statisticians.

Crowds solve only a few types of problems, and some better than others

  1. cognition: weight of an ox or where to build a public swimming pool
  2. coordination: stock markets, traffic, organization in companies. Coordination is well harnessed by companies like Zara, which is basically a logistics-centric organization.
  3. cooperation: paying taxes, fighting pollution, democracy. Cooperation looks like the type of problem that crowds usually fail at, e.g. the corruption in the Italian football league or the measurement of TV audiences by Nielsen.

Conditions of the individuals that make groups intelligent

  • diversity
  • independence
  • private judgement: information, analysis or intuition
  • decentralization, e.g. the scientific effort to fight the SARS epidemic

Diversity and independence are important because the best collective decisions are the product of disagreement and contest, not consensus or compromise.

Factors that negatively affect decision making by groups

Sequential decision making. The best technology will not necessarily win in a market-driven selection process. It is a wasteful process

Drift towards consensus over dissent

Verdict-driven juries over evidence-based juries

Armstrong’s seer-sucker theory: “No matter how much evidence exists that seers do not exist, suckers will pay for the existence of seers.”

New information by a few is ignored, misinterpreted or modified to fit old messages

Group polarization, sequence or status deference. Talkative people are not necessarily well liked but they tend to be influential. Very few human beings perform consistently well in an environment of negative reinforcement

Summary of the ideas of the book

  • The Difference Difference Makes:
    • Waggle Dances: send out as many scout bees as possible when the alternatives are unknown. Example: gasoline-powered cars reaching mass production ahead of steam-powered cars.
    • The Value of Diversity: Expertise beyond a minimal level is of little value in forecasting change
  • Monkey See, Monkey Do: Imitation, Information Cascades and Independence
  • Putting The Pieces Together: The CIA, Linux and the Art of Decentralization
  • Shall We Dance? Coordination in a Complex World. Imperfect markets composed of irrational people can still produce near-ideal results
  • Society Does Exist: Taxes, Tipping, Television and Trust. Willingness to punish bad behaviour even when you get no personal material benefit from doing so. Be nice, forgiving and retaliatory if you expect successful cooperation.
  • Traffic: What We Have Here Is a Failure To Communicate. Congestion charging leaves the decision to drive or not in the hands of the individual.
  • Science: Collaboration, Competition and Reputation
  • Committees, Juries and Teams: The Columbia Disaster and How Small Groups Can Be Made To Work. A successful face-to-face group is collectively intelligent: it makes everyone work harder and think smarter (the intellectual swing).
  • The Company: Meet The New Boss, Same as the Old Boss? Decentralization allows people to make decisions and become engaged. It also makes coordination easier. Irrational people can add to collective rationality.
  • Markets: Beauty Contests, Bowling Alleys and Stock Prices. Shorting stocks is riskier than buying them. Investors are concerned not just with what the average investor thinks but with what the average investor thinks the average investor thinks. A crash is the inverse of a bubble, although more sudden. We do not know why crashes occur or why bubbles start.
  • Democracy: Dreams of the Common Good: a healthy democracy inculcates the virtues of compromise – which is, after all, the foundation of the social contract


Some of the things the book made me think about

The rule by a technocratic elite fails because of small-group dynamics, groupthink and lack of diversity. Groupthink works not so much by censoring dissent as by making it seem somehow improbable.

The judicial system is a particular case of non-elected, insulated elites. So are journalists employed by media powerhouses in representative democracies. Scientists and mathematicians also suffer from similar problems, but this is partially mitigated by a few factors. Incentives and reputation rewards are usually available to those scientists who turn curiosity into results and challenge established views. Reputation is not the only currency of the scientific market: eventually the value of new ideas and empirical evidence is also part of the transaction. Part of the job of scientists, at least nominally, consists in verifying the rigour and robustness of the work of other scientists before building further work upon it.

My questions about some points of the book

Surowiecki is more interested in the how than in the why. In this regard, his book reads like a descriptive essay. I miss a bit more reflection on why groups work well at solving some problems and not others. The author simply mentions that “this is the way the world works”. It is funny how smart authors have different approaches to their ideas. Philip Ball’s Critical Mass, on the one hand, goes too far, in my opinion, in using Hobbes as a base to explain many ideas, by consistency or contrast. Surowiecki is no less profound in his ideas, but he touches on many of them without historical references.

Is the efficiency of groups at solving numerical problems yet another piece of evidence of phenomena involving systems of minimum energy? The experiments of Epstein with inter-dependent agents show that agents are lazy; they want to do as little thinking as possible. I wonder if independent agents share with dependent ones the spontaneous conformity to whatever is minimal energy, which also happens to be, on average, the “right” strategy, solution or state of a physical system?

Why is the arithmetic average the best metric to assess the wisdom of crowds? Is the distribution of the answers normal? Are the mode or median even better estimates? This question reminds me of my excitement about getting the book “The Flaw of Averages: Why We Underestimate Risk in the Face of Uncertainty” by Sam L. Savage in a few days' time.
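A small sketch of why the choice of metric is worth asking about. The guesses below are hypothetical numbers I made up, with one wild outlier added on purpose; nothing here comes from the book. It only shows that a single extreme answer can pull the arithmetic mean away from the truth while the median and the mode stay put.

```python
import statistics

# Hypothetical guesses of a weight whose true value is 1200.
# The last guess is a deliberate outlier.
TRUE_VALUE = 1200
guesses = [1150, 1180, 1190, 1200, 1200, 1210, 1230, 1250, 3000]

mean_guess = statistics.mean(guesses)      # sensitive to the outlier
median_guess = statistics.median(guesses)  # middle value of the sorted guesses
mode_guess = statistics.mode(guesses)      # most frequent guess

for name, value in [("mean", mean_guess), ("median", median_guess), ("mode", mode_guess)]:
    print(f"{name:>6}: {value:8.1f}  error: {abs(value - TRUE_VALUE):7.1f}")
```

With a roughly symmetric spread of answers and no outliers the three metrics would sit close together, so which one is "best" depends on the shape of the distribution of answers, which is exactly the open question.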

Wisdom might not be the right term for the phenomenon of crowds being accurate and precise. Wikipedia defines the term as “Wisdom is a deep understanding and realizing of people, things, events or situations, resulting in the ability to choose or act to consistently produce the optimum results with a minimum of time and energy. …” Can the phenomenon be defined as “deep”? Crowds seem to be good at solving some quantitative problems, but can we assume that this is deep understanding, or is it simply the laws of mechanical physics at work?