Mechanical Turkers Aren’t Representative of the 101Questions Userbase, Like At All

Some time ago I thought about bringing the enormous userbase of Mechanical Turk along to rate questions and videos here on 101questions. We could show them your photos and videos and they would ask their questions or simply move to the next if they were bored. Consequently, we could more quickly find more perplexing photos and videos.

As a test, I pulled ten photos out from our database that corresponded to the ten decile marks of the 101questions bank of photos and videos. This selection of photos represents our full range, in other words, from very boring to very perplexing.

Then I showed them to 100 Mechanical Turk users and paid them to answer with a question or a skip. Here are the results.

Photo              101qs Rating  Turk Rating
Ticket Roll        81            71
Dueling Discounts  66            83
Dominos            56            73
Rally              52            87
Mural              48            92
Sunflower          43            69
War!               39            74
Dash               34            69
Shot put           28            92
River              23            90

Let me graph that for you.

[Image: 140318_1]

The correlation between our ratings and theirs is basically non-existent, and if it exists, it’s negative (i.e., the more popular an image is on our site, the less popular it is with Turkers).
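To put a number on that, here is a minimal sketch that computes the Pearson correlation from the ten rows in the table above. The `pearson` helper and variable names are mine, not anything from the site's code, and with only ten photos the result is a rough indication at best.

```python
from math import sqrt

# Ratings copied from the table above: (photo, 101qs rating, Turk rating).
ratings = [
    ("Ticket Roll", 81, 71),
    ("Dueling Discounts", 66, 83),
    ("Dominos", 56, 73),
    ("Rally", 52, 87),
    ("Mural", 48, 92),
    ("Sunflower", 43, 69),
    ("War!", 39, 74),
    ("Dash", 34, 69),
    ("Shot put", 28, 92),
    ("River", 23, 90),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


ours = [row[1] for row in ratings]
turk = [row[2] for row in ratings]
r = pearson(ours, turk)
print(f"Pearson r = {r:.2f}")
```

On these ten rows the coefficient comes out to roughly -0.3: weak, and negative.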

More damning, here’s a distribution of our users and theirs.

[Image: 140318_3]

Our users ask questions at every kind of rate. About 8% of our users land in each bucket: those who ask questions 10% of the time, 20% of the time, and so on, all the way to 100% of the time. 27% of our users ask questions less than 10% of the time and boredom-skip the rest.

Then you have Turk, where the distribution is almost flipped. 40% of Turkers ask questions all the time. No Turkers skip the way our users do at the left end of our distribution. The modes are switched. The fact that the Turkers were paid while our users spend their own currency (their time) to be here may explain why our users are so much more discriminating. Whatever the reason, this small test has convinced me that Turkers aren’t a useful proxy for our own userbase.

2014 Mar 19. The data.

101questions Updates, Gets A Lot More Useful

I updated 101questions today to include a single major new feature: a lesson editor.

[Image: 130319_1]

Creating webpages like this soaks up too much of my time. I have to upload files in three different places. Changing a single word in the lesson means firing up an FTP client. Changing anything about an image takes ten minutes at least. None of this is creative work.

So I put together the task editor I want to use. You can add supporting materials — photos, videos, questions, teacher notes, student notes, links, and more. You can re-order them quickly, all from the browser. More fun is that other users can download them quickly. Click the “Download” button and Internet pixies will zip all the resources up and send the file to your computer.

[Image: 130319_2]

I’ve been using it for a couple of weeks and I’d like you to use it also.

I’ve added other features some of you have asked for:

Better tagging.

You can add tags like “pizza” or “basketball” or “money.” You can type a few key mathematical terms into the Common Core search bar and it will locate standards for you. Of course, all of this will make the search engine much smarter.

[Image: 130319_3]

A smarter search engine.

People e-mail now and then telling me in kind terms how awful this spreadsheet is. I’m in total agreement. Unless you’re fluent in Common Core shorthand, it’s impossible to find tomorrow’s topic today. So now you can head to my page on 101questions, click Search, and then click “Search this user.” Type in what you’re looking for. Click “Has lesson” to narrow down my material to everything that’s been a little more developed. Click the grade boxes to tighten the results down even more.

[Image: 130319_4]

Try it out. Add some tags to your old material. Leave me some comments here. I’ll need as much useful criticism as you can offer. Let’s make this great together.

101questions v0.4

The big changes:

  • You can upload files now. No more pasting links to external content. You no longer have to upload your image to Dropbox or ImageShack or anywhere else (an incredibly cumbersome step for a lot of people) just to get material onto 101questions. We’re no longer restricted to YouTube’s hardline interpretation of Fair Use either.
  • You can download files now. Click “Actions” on any uploaded first act and then click “Download.” It’s awesome. It downloads whatever file the user uploaded (it won’t pull down content uploaded to YouTube or Vimeo) along with a text file with all the submitted questions.
  • You can get more responses more quickly by sending your link around. It bummed people out that they’d link to a first act and other people couldn’t add a question unless they saw it randomly come up on the homepage. “You should be able to add a question to the page itself,” they said. I resisted but I was wrong and now you can.

The small changes:

  • A pile of corrections to aspects of the UI that annoyed me, Amazon S3 integration, automatic comment subscription, a lot of ground laid for the winter update.

101questions Updates

Here are the top-level updates for 101questions:

  1. You have better quality control rankings. I’m no longer listing the top ten most perplexing people on the site. We can bring that list back if we miss it, but my sense from conversations on this blog and at Stanford is that it was ultimately more divisive than useful. I’ve also split the top ten lists for photos and videos and added a “right now” option alongside the “all time” rankings, so we can see what’s recently perplexing.
  2. You can bookmark questions now. Maybe your first act received eighty questions, but those eighty questions are really only composed of four or five distinct questions. You can now click a bookmark icon and put them in order from most common to least. You’ll help other people (and yourself) get a better sense of the questions people asked about your first act.
  3. Comments. You asked for comments. You got ‘em.
  4. You can delete your own first acts now. Maybe someone’s comment gave you a better idea for your timelapse video of grass growing on your lawn. Now you can delete the old one before you upload the new one.
  5. You have better access to the feedback on all your first acts. I’m really happy with the new “latest” tab in your profile. It has more information — you’ll see questions and skips like before but also comments and bookmarks — in a cleaner layout.
  6. You won’t be able to upload itty-bitty images anymore. The uploader makes sure your pictures are at least the size of the viewing window.
  7. I got rid of Facebook, Twitter, and G+ sharing. No one used them and Twitter uses them to stalk you around the web. So I got rid of them and replaced them with a “Copy Link” option that puts the shortlink on your clipboard. You decide what you want to do with it.
  8. Animated GIFs are now supported.
  9. You can search the site. Something that’s a little fun is that even though you haven’t tagged your first acts in any particular way, other users have. They’re asking questions about your photos and videos and our search engine finds in those questions the semantic goodness it craves. (i.e., “Everyone is asking questions about a basketball?” says our genial and dimwitted search engine. “Maybe this first act is about a basketball!”) I’ll be messing with the algorithm over time but ideally, at some point in the near future, you’ll come to the site saying to yourself, “I’d love to motivate completing the square with a video of Australian rugby” (or something equally unlikely) and the site will deliver.
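For the curious, the search idea in item 9 can be sketched in a few lines. This is an illustration only, not the site's actual algorithm: the `build_index` and `search` functions, the stopword list, and the sample first acts are all made up for the example. It counts the words people use in their questions and ranks first acts by how often a search term shows up.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"a", "the", "is", "in", "that", "how", "will", "which", "go"}


def build_index(questions_by_act):
    """Map each word to a Counter of first acts whose questions mention it."""
    index = defaultdict(Counter)
    for act, questions in questions_by_act.items():
        for question in questions:
            for word in re.findall(r"[a-z]+", question.lower()):
                if word not in STOPWORDS:
                    index[word][act] += 1
    return index


def search(index, term):
    """Return first acts ordered by how often their questions use the term."""
    matches = index.get(term.lower(), Counter())
    return [act for act, _count in matches.most_common()]


# Made-up first acts and questions, for illustration only.
questions_by_act = {
    "hoop-video": [
        "How high is the basketball hoop?",
        "Will the basketball go in?",
    ],
    "pizza-photo": [
        "Which pizza is the better deal?",
        "Is that a basketball in the background?",
    ],
}

index = build_index(questions_by_act)
print(search(index, "basketball"))
print(search(index, "pizza"))
```

The uploader never tagged anything here. The askers did the tagging by asking.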

Add in a slew of performance tweaks and other odds and ends and you have a site update that’s been a long time in the making. If you see anything fun or funny, don’t hesitate to let me know.

101questions Messes With Google Users

People use Google to find answers to questions. 101questions is a website that hosts a lot of questions. This resulted last week in a spike in traffic when a lot of people watched the NCAA men’s championship basketball game and asked Google, “How long are Anthony Davis’ arms?”, a question which, as you can see from the top result of that query, we’ve been asking a lot.

I have no other comment, except to feel sorry that 101questions doesn’t (yet) offer any answers to those questions.

Behind The Scenes

[cross-posted to dy/dan]

We’re one week into 101questions and the early feedback has been encouraging. For a certain kind of warped individual (i.e., my kind of individual) the experience seems to be, in a word, addicting. It’s also fun to find a non-trivial Swedish contingent jumping aboard. The more effective use we make of visuals, the more we can include learners who speak English as a second language, if they speak it at all.

After a week, 500+ registered users have uploaded 300+ photos and videos which have provoked 10,000+ questions across all users, including a number of unregistered users I haven’t counted. (The analytic component of site administration is right in my wheelhouse, as you can see.) We even have a registered troll, which means we’re halfway to a full-fledged online community.

Here’s a description of where 101questions came from, the problems it tries to solve, and a few notes on where it might go.

Where It Came From

I piloted the idea online in webinars and face-to-face in workshops. I tweaked the constraints and the implementation and arrived at an exercise that teachers found both challenging and fun, which seemed like the right combination. Teachers liked rating photos and videos as perplexing (or not). That same feedback on their own photos and videos helped them improve their eye for perplexity.

I introduced it on Twitter as #anyqs. You’d post a link to a photo or video (hereafter called “the first act”) and ask for questions. That implementation was good for a time, but ultimately very problematic.

Problems 101questions Tries To Solve, In Order Of Importance

Here’s the biggest:

The feedback to your first act is proportional to the quantity of your Twitter followers.

yeah, well that works great with almost 7500 followers. Less well with 4. That arent math teachers.

I have the most followers of anyone who has contributed to the #anyqs tag. I also get the most responses to my photos and videos. That correlation extends all the way down to people with a dozen followers who get very few responses in spite of their work being thoroughly perplexing. That’s a pity.

At 101questions, your first act goes into a huge pile along with mine and both of ours are served up randomly to other users until it gets 100 responses.

People post whatever they want and tag it #anyqs.

I’m talking about full web pages, long, meandering videos, Flash applets, etc. There is a place for all those things, but they all miss the design of the exercise: one photo or one minute of video.

At 101questions, your attempt to upload anything outside of those constraints will get you an invitation to revise and resubmit.

Tweets are fleeting. Perplexity should endure.

We don’t have a record of all the perplexing photos and videos you’ve posted on Twitter. Many of the #anyqs participants likely couldn’t dredge up their own contributions. I’ve saved all of them locally, but that takes a lot of diligence and they’re basically lost to the wind for everybody else. Along those lines, it’s also hard to know if someone has already posted a particular first act.

At 101questions, your contributions are stored in a database and logged in your profile. (Here’s mine.) The application also checks to see if a particular link has already been uploaded and, if so, points you to it. There is a bookmarking feature. You can save first acts for later.

It’s hard to know if you’re bored by my first act or if I just missed you.

I wish there were a “Skip It, I’m Bored” button attached to my #anyqs submissions on Twitter. If responses to my first act are light, I may infer it wasn’t perplexing, but sometimes I wonder if I just queried my followers at the wrong time.

At 101questions, there is a “Skip It, I’m Bored” button.

There isn’t any way to filter for quality.

What was the best photo posted last month? Which people post the best material most consistently? Where can I find their photos and videos? How are we defining “best” anyway? Those questions can’t be answered within our Twitter pilot.

At 101questions, I’ve set up a metric called “perplexity” which amounts to the likelihood your first act will provoke a question. (Technically, it’s the number of questions that have been asked about your first act expressed as a percentage of total skips and questions. 75 means three quarters of everybody who has seen your first act has asked a question about it.)
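As a minimal sketch of that calculation (the function name and the zero-response behavior are my assumptions, not the site's code):

```python
def perplexity(questions: int, skips: int) -> float:
    """Questions as a percentage of all responses (questions plus skips)."""
    total = questions + skips
    if total == 0:
        # Assumption: a first act nobody has seen yet scores 0.
        return 0.0
    return 100 * questions / total


print(perplexity(75, 25))  # 75.0
```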

People post material because it seems vaguely connected to a discipline, not because it provokes a question.

“Interesting” isn’t the same as “perplexing.” “Engaging” is a different animal also. It’s easier to dazzle a student with fireworks than to provoke her to wonder a question. When I’m unperplexed by someone’s #anyqs material on Twitter, I’ll often tweet back, “What question did that photo make you wonder?” In my perfect world, I’d see your own question alongside the first act you uploaded, but only after I submitted my own, so my question is raw and unbiased by yours.

At 101questions, the upload page has fields for a link and a title. Then a blank for your question.

It’s difficult to see other people’s questions about a first act.

If someone tweets a first act I find perplexing, I often want to know if it perplexed other people and, if it did, the questions they asked. That’s difficult on Twitter.

At 101questions, everyone’s questions are logged beneath each first act.

Where This Might Go

Tagging. Searching. Commenting. Top ten lists for “today” and “the last week,” not just “all time.” A mobile application. The ability to submit files from your phone or computer, not just links. Complete mathematical stories, not just the first act. If we’re working on circumference tomorrow, I’d like to go to 101questions, find a list of complete mathematical stories for “circumference” sorted from most perplexing to least, and then download it to my hard drive. Those features will be expensive to develop and sustain. The core feature — getting 100 responses to your first act — will always be free but I may invite you to pay community membership dues for access to the fancier stuff.

Way, Way Behind The Scenes

One of the most annoying features of edu-punditry is how quickly our gurus decide they’ve done absolutely everything they can to help us understand and accomplish their vision for learning. They write their blogs, publish their books, tweet their tweets, and give their speeches. Having decided they’ve done everything possible to help us wrap our brains around ideas that are obvious to them, their last recourse is to snark, sarcasm, hectoring, and irrelevancy.

In reality, their messages can almost always be clarified, made easier, more fun, and less expensive. I want nothing to do with that culture of punditry. I can be clearer. I can find new metaphors. I can publish in more media. And I can create tools to make these practices easier. That’s 101questions.

On Rankings

We set up rankings to help you locate more perplexing first acts. To make this process less mysterious, let me explain how 101questions calculates perplexity and then those rankings.

First Acts

For first acts, perplexity is the number of questions asked for every 100 people who see it. Every first act closes to new questions after 100 people see it. Its final perplexity score, then, is the number of questions it has at that point. (i.e., if it gets 40 questions and 60 skips, its perplexity score is 40%.)

Once a first act has 25 questions, it’s eligible for the top ten list, which shows you the first acts with the highest perplexity ranking.
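Here is one way those two rules could fit together, sketched under assumptions: the `FirstAct` class, its field names, and the sample data are hypothetical, and the real site may break ties or handle edge cases differently.

```python
from dataclasses import dataclass

RESPONSE_CAP = 100   # a first act closes after 100 people have seen it
MIN_QUESTIONS = 25   # questions needed before it can appear in the top ten


@dataclass
class FirstAct:
    title: str
    questions: int = 0
    skips: int = 0

    @property
    def is_open(self) -> bool:
        """True while the first act can still collect responses."""
        return self.questions + self.skips < RESPONSE_CAP

    @property
    def perplexity(self) -> float:
        """Questions per 100 responses so far."""
        total = self.questions + self.skips
        return 100 * self.questions / total if total else 0.0


def top_ten(acts):
    """Eligible first acts, most perplexing first, at most ten of them."""
    eligible = [act for act in acts if act.questions >= MIN_QUESTIONS]
    return sorted(eligible, key=lambda act: act.perplexity, reverse=True)[:10]


# Made-up first acts for illustration.
acts = [
    FirstAct("A", questions=40, skips=60),  # closed, perplexity 40
    FirstAct("B", questions=20, skips=5),   # perplexity 80, too few questions
    FirstAct("C", questions=30, skips=10),  # still open, perplexity 75
]
print([act.title for act in top_ten(acts)])  # ['C', 'A']
```

Note that B has the highest score of the three but stays off the list until it collects five more questions.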

People

For people, your perplexity score is the average of the perplexity scores of all your first acts. If you have uploaded two first acts, one of which has a score of 50 and the other 100, your perplexity score is 75.

Once you have uploaded more first acts than the median number of uploads, you’re eligible for the top ten list, which shows you the people with the highest perplexity ranking.
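The same rules for people, as a rough sketch. The function names, usernames, and scores below are invented for the example; only the averaging and the median cutoff come from the description above.

```python
from statistics import mean, median


def person_score(act_scores):
    """A person's perplexity: the average score across their first acts."""
    return mean(act_scores)


def eligible_people(scores_by_user):
    """Scores for users who have uploaded more than the median number of first acts."""
    cutoff = median(len(scores) for scores in scores_by_user.values())
    return {
        user: person_score(scores)
        for user, scores in scores_by_user.items()
        if len(scores) > cutoff
    }


# Hypothetical users and the perplexity scores of their first acts.
scores_by_user = {
    "ana": [50, 100],
    "ben": [90],
    "cy": [60, 70, 80],
    "di": [40],
}

print(person_score([50, 100]))  # 75
print(eligible_people(scores_by_user))
```

In this made-up data the upload counts are 2, 1, 3, and 1, so the median is 1.5 and only "ana" and "cy" qualify for the list.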

About

We don’t care how well you lecture. We don’t care how well you engage us. We aren’t impressed by your fancy slide transitions or your interactive whiteboard. We care how well you perplex us.

Can you perplex us? Can you show us something that’ll make us wonder a question so intensely we’ll do anything to figure out the answer, including listen to your lecture or watch your slides? Here’s one way to find out. Upload a photo or a video. Find out how many of us get bored and skip it. Find out how many of us get perplexed and ask a question.

Then figure out what you’re going to do to help us answer it.

Signed,

Your Students