Mark Bernstein: Designing A Conference: Details

Designing A Conference: Details

Some details might help Tinderbox novices follow “Designing A Conference With Tinderbox.” If you haven’t read that post already, you should probably read it first.

Prototypes For Papers

I seldom build a tree of prototypes in Tinderbox, but this task is an exception.

We have a prototype Paper that represents a submitted research paper. It has the key attributes you’d expect: author, title, submission number, reviewer scores, and the email address of the corresponding author.

We then have a bunch of prototypes that inherit from Paper and represent various categories of accepted paper. These prototypes do two things:

Add some distinctive appearance, so it’s easy to scan a complex map and pick out the accepted papers.
Provide an easy handle for agents that want to search for all the accepted papers, or for specific kinds of accepted paper.
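
For readers who think in code, the prototype tree behaves a little like class inheritance. The sketch below is only an analogy in Python, not Tinderbox itself; the class names, the color attribute, and the sample notes are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    """Base prototype: the attributes every submission carries."""
    title: str
    authors: str
    submission_number: int
    reviewer_scores: list = field(default_factory=list)
    contact_email: str = ""
    color: str = "gray"  # hypothetical default appearance


@dataclass
class AcceptedLongPaper(Paper):
    """Inherits everything from Paper and changes only its appearance."""
    color: str = "green"


@dataclass
class AcceptedPoster(Paper):
    color: str = "blue"


# Hypothetical notes, one per prototype.
notes = [
    Paper("Hypothetical undecided paper", "A. Author", 1),
    AcceptedLongPaper("Hypothetical long paper", "B. Author", 2),
    AcceptedPoster("Hypothetical poster", "C. Author", 3),
]

# An "agent" that wants the accepted papers only has to ask about the prototype.
accepted = [n for n in notes if type(n) is not Paper]
posters = [n for n in notes if isinstance(n, AcceptedPoster)]
```

The point of the analogy is that the subclasses add nothing but a distinctive look and a name to search on, which is all the accepted-paper prototypes do.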

Agents For Double Checking

Web Science ’13, like many conferences, uses the EasyChair Web application to coordinate reviewers. EasyChair is a headache, but perhaps less of a headache than the old days when we photocopied every paper four times, stuffed and mailed a hundred envelopes, and collated reviews on paper tally lists.

EasyChair gives you a convenient count of the number of acceptances you’ve sent. Obviously, we want to be confident that our own records and EasyChair’s are in sync. One way to increase our confidence is to check that the number of accepted papers in each category matches the number of EasyChair acceptances. An agent can make short work of this.

We simply look for all the notes that have the appropriate prototype, count them up, and check the total against the number of acceptance emails. If they don’t match, you know you’d better go hunt down the discrepancy!
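
The check itself is simple enough to sketch outside Tinderbox. The Python below is an illustration of the idea, not the agent I actually used; the function name, the category names, and the counts are hypothetical.

```python
from collections import Counter


def find_mismatches(notes, easychair_counts):
    """Return {category: (ours, theirs)} for every category whose counts differ."""
    ours = Counter(prototype for _title, prototype in notes)
    categories = set(ours) | set(easychair_counts)
    return {
        c: (ours.get(c, 0), easychair_counts.get(c, 0))
        for c in sorted(categories)
        if ours.get(c, 0) != easychair_counts.get(c, 0)
    }


# Hypothetical data: (title, prototype) pairs from our own records,
# and acceptance totals copied by hand from EasyChair.
notes = [
    ("Paper A", "LongPaper"),
    ("Paper B", "ShortPaper"),
    ("Paper C", "Poster"),
    ("Paper D", "Poster"),
]
easychair_counts = {"LongPaper": 1, "ShortPaper": 1, "Poster": 1}

mismatches = find_mismatches(notes, easychair_counts)
```

An empty result means the two sets of records agree; anything else names the category where the hunt should begin.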

Even then, you never know. One author of an accepted paper didn’t read his email with sufficient care, and assumed his paper had been accepted for a workshop. He came, read his paper to the workshop, and left Paris. Only as he used the train’s wifi on the return voyage did he realize his mistake. That was awkward, but not as awkward as the encounter would be with a researcher who has travelled a long distance at great expense, only to find their presentation is not on the program.

Trial By Fire

Using the prototype Tinderbox Six to manage the program was a risk. Through the process, the software changed from day to day, and it was not unusual for progress on the conference to require a quick fix to the software. Many details of the screen layout in these examples will therefore look a bit strange to today’s Tinderbox 5.12 users, and they’ll doubtless seem quaint when Tinderbox Six is actually released.

I also kept my conference notes in Tinderbox Six, such as they were: program chairs have many distractions, I was on stage a lot, and it’s hard to take notes when you’re on stage.

As usual, the left margin is reserved for notes to myself — especially notes about #Tinderbox features that I wanted as I made the notes. For example, the test version I was using didn’t understand that double-clicking an adornment should create a new note, just like double-clicking in the background of the map. This is the sort of thing that Test Driven Development doesn’t catch (it’s a dog that didn’t bark in the night) but that is still very good to know about. Tinderbox Six was well behaved through the conference, giving us a little more confidence as we approach the widening of the circle.

Questions? Email me.


Designing A Conference With Tinderbox


At the close of Web Science ’13, conference chair Prof. Hugh Davis said some very kind things about the construction of the program. During the final deliberations, he was in Southampton and I was in San Francisco. When we made the final decisions late in the Southampton evening, it seemed we had a big bundle of ill-assorted papers. When Hugh awoke the next morning, all the papers were neatly sorted into sessions and assembled into a draft program.

Though Tinderbox isn’t designed for this task, it turns out that Tinderbox does it quite well. I’d like to walk through it in some detail (perhaps too much detail) because the task is itself not uncommon or unimportant, and because lots of other scheduling tasks have similar properties.

The Nature Of The Problem

You never have enough time to plan the program for a peer-reviewed conference.

On the one hand, the deadlines for submitting papers and for submitting peer reviews need to be as late as they possibly can be. You want the latest results at the conference and the best results of the moment – not the best of last year. These days, researchers tend to submit late, and reviewers are even less punctual. While it’s possible to take a firm line with authors, reviewers have the upper hand and they know it; they’re important and busy people and you need their reviews more than they need you.

But once the reviews are in, there’s lots of pressure from the other end to have a final program. The Proceedings Chair wants the list of papers, yesterday. The Publicity Chair needs a program to publicize. The Powers That Be are always very edgy at this moment; they’ve committed to almost all the conference expenditures at this point and they’re terrified that no one will come. It never fails: the grizzled and ultra-competent Professor who has done this dozens of times before will, inevitably, wake up at this point in a cold sweat and email everyone to demand a finished program right away.

What You Can Do In Advance

In the nature of things, the Venue and the Powers That Be will dictate the shape and duration of your conference. For this conference, these constraints included:

A tradition of a single track, without parallel sessions
Three days (plus workshops, which Claudia Roda handled so adeptly)
Two fixed keynotes, shared with other conferences, that cannot be moved
Our own keynote and a plenary panel, also fixed by the speakers’ other commitments
We’re not buying lunch, and this is France; we need to allow plenty of time for a lunch break.
The opening and closing times of the auditorium (typically constrained by local regulations or work rules)

Now, other conferences have different constraints, and some of these constraints might be finessed. Computer Science conferences, for example, never have evening sessions. Biochemistry conferences do, and that’s an arrow for our quiver if we need one.

All these constraints can be hard to keep in mind, but it’s easy to write them down in the form of a quickly sketched schedule.

Each box is a Tinderbox adornment. Each label is an adornment, too. This doesn’t need to be precise or drawn to scale: it’s just a sketch. (Of course, it’s much larger and easier to read on your screen).

Looking at this, we can see that we have eight sessions to plan. Each session runs 90 minutes and so can accommodate 3 long papers, 6 short papers, or some combination of long and short papers. So, we could possibly accept as many as 48 short papers.
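
The arithmetic is worth spelling out, because it sets the ceiling for everything that follows. This is a back-of-the-envelope sketch in Python; the slot lengths (30 and 15 minutes) are my inference from "3 long or 6 short papers in 90 minutes" and assume that questions and changeovers fit inside each slot.

```python
SESSION_MINUTES = 90
LONG_PER_SESSION = 3
SHORT_PER_SESSION = 6

# Implied slot lengths, in minutes.
long_slot = SESSION_MINUTES // LONG_PER_SESSION     # 30
short_slot = SESSION_MINUTES // SHORT_PER_SESSION   # 15


def max_short_papers(sessions, long_papers=0):
    """Short-paper slots left after reserving time for the long papers."""
    total_minutes = sessions * SESSION_MINUTES
    remaining = total_minutes - long_papers * long_slot
    return remaining // short_slot


all_short = max_short_papers(8)                  # every session full of short papers
mixed = max_short_papers(8, long_papers=12)      # a hypothetical mix
```

With eight sessions and no long papers this gives the 48 short papers mentioned above; every long paper accepted costs two short ones.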

Special Events

Web Science always has a poster session as part of the main program. It’s unusually strong, featuring good work from senior researchers. It’s tough to get experienced people to do posters, which in other conferences are dominated by student work, so we need to give posters a large and prominent slot. But we already have two keynotes on Day 1! We’ll put the posters on Day 2, and give them 2 hours. But then the coffee break, which is again fixed by the venue contract, falls at the end of the posters. So, we’ll move lunch a little earlier, split the morning session in half, and now the coffee break falls conveniently at the midpoint of the poster session. We’ve got an odd space at the end of the day, but I’ve got some ideas for setting up an invited panel anyway in the name of program balance.

So now we have six sessions of research papers. I’m not happy that three of them fall on Day 3, but decide that can’t be helped.

Pecha Kucha

We’re still well in advance of making program decisions at this point. Reviewers are reading and pondering their assignments. We’ve got a lot of submissions — 198 — and lots of interesting topics. My own impression, though, is that we don’t have many papers that stand head and shoulders above the rest. Making decisions will be difficult.

In addition, I’ve been worried for years about the quality of presentations at research conferences.

Cons, or Why We Are Unhappy At Conferences

I work at my talks, but I have to: I have the legacy of a speech impediment and the handicap of choosing topics that are usually unfamiliar. Lots of researchers are not especially talented presenters, but there’s no reason to expect they would be. You wouldn’t expect researchers to be especially good singers or right fielders, either.

Pecha kucha talks (about which I’ll write more later) are usually considered a risk-averse programming technique, a way of minimizing the damage one lousy presentation can do. That’s not my concern here; we’ve got the whole arsenal of peer review to cover that. But the discipline of 20 slides, changing every 20 seconds, helps bring out the strengths and hide the weaknesses of academic presenters. No one uses enough slides: here, the format insists on it. Too many presenters forget to speak up; here, they’ve got the adrenaline rush of summing up years of research in 400 seconds. Students, especially, tend to get lost in a forest of detail, but with only 20 seconds per slide, they’re constantly reminded of the need to explain the big picture.

I want to try this. I sense that other people on the committee aren’t exactly enthusiastic about the idea, but sitting in the chair has some perks. We drop that into Day 1, session 2. That gives us 11 pecha kucha talks, and I make a mental note to ensure that some really good papers and reliable presenters are among them.
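
A quick sanity check on the timing, sketched in Python. It assumes the pecha kucha slot is one of the standard 90-minute sessions, which is my assumption here rather than something fixed by the format.

```python
SLIDES = 20
SECONDS_PER_SLIDE = 20
TALKS = 11
SESSION_SECONDS = 90 * 60  # assumes a standard 90-minute session

talk_seconds = SLIDES * SECONDS_PER_SLIDE            # length of one talk
speaking_seconds = TALKS * talk_seconds              # all eleven talks, back to back
slack_seconds = SESSION_SECONDS - speaking_seconds   # time left for changeovers
slack_per_talk = slack_seconds / TALKS               # changeover time per speaker
```

Eleven talks of 400 seconds use 4,400 of the 5,400 seconds available, leaving roughly a minute and a half per speaker for introductions and swapping laptops.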

When discussing which papers to accept, I make a point of asking whether an accepted paper might be suitable for the pecha kucha session. By the time we’re done, we’ve filled the pecha kucha roster. (In the end, one of the best-paper winners and two runners up came from the pecha kucha session.)

The Talks

The peer review process identifies acceptable talks. Every paper is read by at least three reviewers. I try to mix expertise and disciplines in assigning reviews, so we often have very different people discussing the same paper. Difficult or contentious papers get additional reviewers. Some have five or six. The goal is to accept every paper that is acceptable, but none that are not.

In addition, we have some tight constraints. Wall space limits us to 45 posters. Our five sessions can fit 15 long papers or 30 short papers. We’ve already lined up the 11 pecha kucha papers.

Posters and Presentations

Lots of conferences use posters as a training ground, but at Web Science we want them to be a first-rate venue. Some papers lend themselves to posters.

One strong message
Topics with clear appeal to everyone
Topics with specialized appeal and clear importance
Implemented systems
Controversial methodology

The last point bears some elaboration. Occasionally, conferences receive papers that are difficult to evaluate because they are methodologically unorthodox. Reviewers are not confident that the results are wrong, but strong doubts are expressed. Discussion will improve the underlying work, but how can you arrange for that discussion? Referee reports may not be enough, especially not if the author simply assumes the reviewer is hostile or has failed to understand their work. A paper presentation might not work, either, because even a carefully prepared question might get bogged down in details in which most of the audience isn’t interested. Posters are perfect for this; you can meet with people and establish that (a) you’re a reasonable fellow, (b) you understand their work, but (c) they could be more convincing if only they addressed some objections.

Conversely, some topics lend themselves to presentations.

Lots of messages, each requiring separate discussion
Epistemology, ethics, literature, and other fields without strong visual language
Arguments that require tearing down common assumptions
Arguments that confirm or extend received wisdom

Lots of people will bypass a poster titled “Dogs bite!” assuming that it is student work, confirming what everyone already knows. Sometimes, this sort of research cleverly demonstrates what everyone knew but nobody could actually prove. Sometimes, we demonstrate what everyone thought they knew, but could not really have known before our new experiment. This rhetoric is more effective in a dramatic presentation than in a poster; the poster has to disclose the punchline at the outset where the presentation can build up to it properly.

So, at the end of the day we have about 30 papers destined to be posters. Now we start to build up some sessions.

Building Sessions

The hard work of pulling together 700 reviews of nearly 200 papers led to a very complicated workspace that I used during program committee meetings. Every review was read, every paper examined, and most papers were discussed in some detail. In the end, we had a list of papers that were clearly acceptable, papers that clearly needed more work or that would find a better audience at a different conference, and perhaps a dozen papers on the bubble. It was time to build some sessions.

I had to start somewhere. I picked up Harry Halpin’s “Does The Web Extend The Mind?” It’s got to be a presentation — it’s a philosophy paper, it’s dense, there’s no obvious visual hook. Halpin’s got a panel on Day 3, and I’m not sure this paper is ideal in the leadoff spot on Day 1. So it’s the opening act on Day 2. We don’t have any other papers on the same topic, but we’ve got two papers that harmonize nicely with its psychological concerns.

The mechanics of this are really easy. I make an alias of the paper’s note from the program committee workspace and then paste the alias onto the program adornment. The original note carries metadata like paper number and author email addresses, so those are carried along with the alias. We’ve only got an hour in this split session, so we pencil in one long and two short papers, and we give the session a title.

Going back to the pool of accepted papers, I notice a study about people’s attitudes toward user-contributed reviews. This concerns ownership of crowd-sourced material, and Cory Doctorow, slated to close Day 1, is a renowned intellectual property activist. So, it would be nice if this paper were on Day 1, but not so close to the keynote that it steps on its toes. But this one can bat leadoff – it’s classic Web Science material. Again, there’s nothing else much like it but we have lots of papers about user-contributed material and also lots of papers about crowdsourcing. It’s easy to imagine a session.

Most of the other sessions are equally easy to assemble. The session on Journalism and the News assembles itself. Another collects interesting papers about affinity, ranging from financial sentiment on Twitter and general concepts of privacy to gender in Facebook profiles. The remaining papers break down fairly neatly into those chiefly interested in networks and those concerned with representing data (or people). Suddenly, we’re done.

Cleaning Up

Much of this could be done in a graphics package like Visio, or OmniGraffle. But in Tinderbox, because each of those notes already has the title, author list, and lots more metadata, it was easy to write a quick export template to format the draft program, including the pecha kucha session and the list of accepted posters. All this went straight into Pages (I might easily have used Scrivener) where I fixed the formatting and made sure everything was right.

It was also easy to write agents to do simple checks. How many papers and posters were in the program? How did they compare to the number of acceptance emails we had sent? At this point, I noticed that the program showed we had accepted one poster too many. Were we listing as accepted a poster we had actually rejected? It turned out to be a clerical blunder — I’d made two aliases of one poster and hadn’t noticed the spare. But Tinderbox made it easy to check the number of papers and posters in the program against the number of acceptances we had sent out. It would be awkward to have rejected a paper and then have the researcher show up at the conference empty-handed, only to be asked to do a presentation. Double-entry accounting is your friend.
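
The spare alias is the kind of slip a mechanical check catches at once. Here is a small Python sketch of that check; it is not the Tinderbox agent itself, and the submission numbers, session names, and acceptance total are made up.

```python
from collections import Counter


def duplicated_entries(program):
    """Return submission numbers that appear more than once in the program."""
    counts = Counter(number for number, _session in program)
    return sorted(n for n, k in counts.items() if k > 1)


# Hypothetical program: (submission number, session) for each alias on the map.
program = [
    (17, "Posters"),
    (42, "Posters"),
    (42, "Posters"),  # the spare alias
    (88, "Journalism and the News"),
]
acceptances_sent = 3  # hypothetical count of acceptance emails

surplus = len(program) - acceptances_sent
spares = duplicated_entries(program)
```

A surplus of one says something is wrong; the list of duplicated submission numbers says where to look first.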

So, a few hours after the final decisions were made, we had a nice draft program ready for discussion. Additional changes would be made, but the bones of the program were all in place. When changes were needed, moreover, it was easy to move papers in the Tinderbox map and see the impact on the program.


Too Much Philosophy?

The program at Web Science 2013 was diverse. For example, here’s the roster for the pecha kucha session:

Who Wants To Get Fired?
Experiences Surveying the Crowd ◀ best paper award ◀
Why Individuals Seek Diverse Opinions (or Why They Don’t)
Considering People with Disabilities as Überusers for Eliciting Generalisable Coping Strategies on the Web
Voice-Based Web Access in Rural Africa
Rethinking Measurements Of Social Media Use By Charities: A Mixed Methods Approach
A comparison between online and offline prayer
The Performativity of Data: Re-conceptualizing the Web of data
Debanalizing Twitter: the transformation of an object of study
Why Forums? An empirical analysis into the facilitating factors of carding forums
Toward Google Borders

From technical solutions to impetuous twittering to methodological questions in using Amazon Mechanical Turk to the nature of online prayer, we’re covering a lot of ground.

In the end, we have no choice. There are plenty of people who study some facet of the Web. Web Science studies the Web as an entire phenomenon. It’s not just the plumbing and it’s not just the sociology and it’s not just philosophy. Web Science is the place where philosophy informs the plumbing.

This makes for nifty sessions — you’ve got to love the transition between papers 5, 6, and 7 — but it also creates real tensions. A paper on the nature of trust, for example, simply cannot be correct in the way that a paper on information retrieval can. Then again, lots of people will be able to follow at least part of a paper about trust, but if you’ve forgotten what an eigenvalue is or why the intentional fallacy is false, it’s not hard to get lost in a paper whose author considers the argument straightforward.

In one of my first talks, I got a Nobel laureate completely confused about the elements of my experiment. That was a useful lesson: everyone has a hard time with hard ideas. You’ve spent months or years alone with your problem in a dark room, but your audience hasn’t met it before. Take it easy; they won’t be bored with a few minutes’ review and they won’t think you’re dim.

One significant problem at Web Science right now is a failure of imagination: how do our small studies suggest great consequences? This is not to say that writers should claim too much or write incautiously. But consequences that might rock your own province can strike people from other fields as obscure, and can seem pedantic or worse to people who have work to do.

Web Science is still not very good at working with people who build Web sites and invent Web apps, the very people we ought to be serving and to whom we ought to be listening. For that, we need every eigenvalue, every statistic, and every construct in our toolbox.


Choosing The Best

Eight years ago, I wrote a post about a colleague’s protest about the lack of women in a conference program. Since then, the post has been sitting in the penalty box — the place where volatile posts go to cool off. (Regular readers may be astonished that I possess such a thing.)

Of course, that conference has been over for years. I think, though, that there’s a useful idea here, one that casts some light on what I call the Treaty For Web Science, about which I hope to write soon. So I’ve rewritten and extended the post here.

2005: My friend had written that

There is no such thing as selection from strict quality criteria and nothing else.

Here, I think we've wandered into the swamp or stepped off the end of the pier. If there's no such thing, for example, as selecting from strict academic quality, then universities are just social clubs where some lucky people get to distribute lots of money to their attractive and well-connected friends. That can't be right.

One could, I think, assemble a technical conference program from purely objective criteria that would likely correlate with "quality". We might need to fine-tune our metrics; that's why this is hard. It doesn't mean it can't be done.

Is it possible to select the best baseball player ever, selecting strictly from on-field performance and nothing else? I think so. Can we ask, "Was Babe Ruth a better player than Willie Mays?" We can, and the answer is yes -- even though most people seem to like Mays and lots of people thought Ruth was a jerk. (Update: Eight years later, a more effective comparison would be Barry Bonds and Mays. Or load the deck even more: Barry Bonds or Jackie Robinson. Jackie’s number 42 has been retired from baseball and Bonds might never get into the Hall, but no one is going to argue that Bonds wasn’t a better player.)

Is it possible to select the best 5 novels of the year, arguing strictly from literary quality and nothing else? Most people think this is a plausible enterprise, though it's bound to be difficult. The National Book Award, the Booker, the Pulitzer – they'd mean nothing if people thought they were rigged or jobbed or arbitrary.

As it happens, the last National Book Award (i.e. 2004 or 2005) ended up short-listing five novels. All five were written by women. All were "small" novels. None sold very well. A number of other novelists (Philip Roth, Tom Wolfe) wrote books that were eligible, but weren't nominated. If we neglect the questions raised by Middlesex, we'd expect that all five books on the short list would be written by people of one gender or the other about once every fifteen years, just from luck. It's possible that Roth's maleness worked against him, it's possible that judges thought he was already sufficiently famous, or that having already won the prize, he didn't need another shiny object. It's possible that the judges simply liked the other books more.
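
The "once every fifteen years" figure is easy to check. The Python sketch below assumes, purely for the sake of the estimate, that each of the five short-listed authors is independently as likely to be a woman as a man, and that there is one short list a year.

```python
from fractions import Fraction

SHORTLIST_SIZE = 5

# Chance that all five are of one particular gender (all women, say).
p_one_gender = Fraction(1, 2) ** SHORTLIST_SIZE

# Chance that all five share a gender, whichever it is.
p_same_gender = 2 * p_one_gender

# Expected gap between such short lists, in years.
years_between = 1 / p_same_gender
```

Under those assumptions the chance is 1 in 16 each year, so a single-gender short list should turn up about once every sixteen years, which is close to the rough figure above.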

Writing in The Believer, National Book Award chairman Rick Moody – no slouch of a writer – said that's just the way it turned out. Moody thinks the resentment is, at core, anti-intellectual: famous writers should create the best books, right? He's got a nice polemic on how anti-intellectual spleen has no place in the National Book Award, and how the media furor surrounding the award infantilizes the American book-loving public.

2013: What I didn’t appreciate sufficiently in 2005 is the way this disagreement illuminates a disciplinary boundary. My friend is a humanist steeped in postmodern thought. My background lies in the physical sciences. We seem to be arguing politics, but we’re really arguing disciplinary faith.

My friend’s position, I think, is that all these judgments are necessarily embedded in social contexts and understandings. We can’t truly know which novel was the best of 2013; it’s not really a question that makes any sense. The best we can hope to do is suggest which novel would be the best one for you to read right now. Someone else, at some other time, might find it dull or trite or impenetrable. And if we can’t choose the best novel, how can we choose the very best conference speaker? And might not being female sometimes in itself make one person a more effective speaker than another?

Suppose you’re having a dinner party. Seven guests have been invited; your table can manage eight. Is there one best person to invite? Context is everything here, and it’s entirely possible that balancing genders, personalities, and interests will lead to the best answer.

But science cannot work this way. As Curie said, in science we talk about things, not people. Considering a talk at a scientific conference, we can easily ask (and, one hopes, answer) questions that would confound us in literature:

Is the work completely correct?
Is it completely original?
Does it suggest exciting new avenues for research?
Might it have important practical consequences?

No one can read a new novel and tell you with confidence whether it’s going to inspire lots of novels or not. For plenty of computer science papers, on the other hand, this is immediately apparent. In literature, it might be interesting to hear someone with talent expound a position that’s almost certainly wrong: Edmund Wilson’s case against The Lord Of The Rings, or Jane Smiley’s rejection of Huckleberry Finn. This is even more true in History, which thrives on energetic defenses of such seemingly-indefensible positions as “our sympathies should lie with Sparta, not Athens” or “it might have been better for everyone if Britain had let Germany win WW1.” Even if it turns out that the new argument doesn’t quite hold up, the attempt may well repay some time and effort by giving us a broader understanding and deeper sympathy.

But in science, wrong is wrong. And few things would be more wrong than preferring paper A to paper B because the author of A, though he’s clearly made a blunder this time, is an important fellow while the author of B is an unknown student from a backwater. To take the speaker’s podium away from B and give it to A would, in the sciences, be a revolting crime and a scandal. It’s unthinkable.

A fairly precise parallel can be found in the Anglo-American legal tradition. Suppose Smith, a beloved movie star, has committed a serious crime. He is immensely wealthy. He is head of prominent charities and is considering running for office. Thousands of workers depend on him and would lose their jobs if he weren’t available to make his next film. May we excuse the crime? The Romans would have answered without hesitation, “yes.” But the Anglo-American tradition is unambiguous: though the sky fall, let justice be done.

Now, even in the sciences we may have tough decisions. We might not catch a mistake. We might not know that something has already been published, especially if the first publication was obscure or if it used a different notation. Reasonable people can disagree over whether a given result is intriguing or rather dull. Committees can err. But, obviously, they must not commit crimes.

Now, scientists are not (always) dim or parochial. They understand that people are fallible, and they understand that in other fields to ask for a judgment of whether a conclusion is wrong is to ask too much. It’s impossible to apply the standards of physical chemistry to a paper on ethics or narratology. But to consider persons, not facts, when choosing conference papers is going to make scientists very, very uncomfortable.


Preamble

Websci this year received a lot of work.
One ninety eight submissions were reviewed
By more than seventy program committee folk,
And in the program we managed to find space 
For forty talks and more than forty posters
By squeezing every minute, every meter,
And plotting out the pecha kucha show
I hope you’ll all enjoy right after lunch today.

I emphasize we always separate
The mode of presentation from publication.
Some of the papers we thought best became
Posters or short talks because we thought
They’d show to best advantage in a smaller space.

So we might think ourselves well pleased,
A happy conference, prosperous and strong.

This year, you gave this conference many frights.

We waited for your papers anxiously
And feared too few would come, 'til at the end
They all poured in at once, and more came late.
The deadline for extended abstracts came
And went, and papers still rushed in. Reviews
Were also plentiful but very slow,
Terser and more shallow than I’d wish.

And so I take a moment here to ask
You all to slow down all of your reviews, 
To water them and let them grow a bit.
Move carefully but well beyond your comfort zone,
And show your work. Tell what you understand
And how. 

	We do not care as much as you 
Just what you like — and don’t. We need to know
More clearly what you thought about, and why. 

❧ 

Disciplinarity is harder than you think.
In school it seems to be for most
A question of departmental boundaries,
One that good-natured friends with ease
Should overcome. Alas, this turns out not
To be the case. Our disciplines
Encode our rules of evidence, and worse
Encode what we think good, and bad, and wrong.
Last night, in fact, I was awake past two
To settle one last vexing argument.
Simple things like how we submit work
And then review it raise questions and tempers too.
What seems straightforward in one field
Another finds intolerably wrong.

❧  

Whoever thinks a faultless piece to see,
Thinks what ne'er was, nor is, nor e'er shall be.					

But still, my friends, this isn’t good enough.
The writing we received is really far from good,
Making all allowance for the fact
That we all come from different states and fields.

I don’t complain of trivial mistakes.
I am myself a very sloppy writer,
And almost every paper I submit
Has missing words and blunders. It’s not these 
That make our papers so damn hard to read,
But rather imprecision in our choice of words
And absence of concision in our prose.
You need not hammer home the structure of your work
If it's the same old structure we have read
Since we were undergrads. 

	But if you write
About the antelopes that roam the Web
It does behoove you well to know exactly what
An antelope might be, and to distinguish them
From beavers, boojums, snarks and ocelots.
You need not argue ocelots are bad!
We simply want to know how your ideas fit
With what we all already do and know.

Precise word use and thorough scholarship			
Are even more important when, as here,
The audience is drawn from many disciplines.

Our topics — timid, inoffensive, mild –
Will seldom cause great outrage or surprise.
The times are bad, the provost even worse,
I understand how fear of a false step
Can tempt us to tread light. But still,
It’s not just me: a bunch of you sent mail
To ask about the timing of your talk
So you might fit it in your travel plans
And rush away to give another talk.
I don’t recall a single message sent
To ask about a colleague’s Web Sci work,
And when their talk might be. 

	Why do we come
To conferences like this? To please our dean?
To earn a meager line on our CV?
That’s not the point. 

	I hope we come to learn
To find the best of what is being done
And thought about this complicated Web.

So thanks for coming. Please enjoy the show;
I look forward to learning what you all newly know.
