
Film is increasingly used across many disciplines (film studies, foreign languages, American cultures, college writing) for a variety of purposes beyond the study of filmic devices and narrative techniques, in particular to portray social behaviors and cultural values. In 2008 the Berkeley Language Center began development of a platform, the Library of Foreign Language Film Clips (LFLFC), to deliver films and film clips for foreign language instruction. In this talk I will discuss the heuristic tools built into the LFLFC, new tools for version 2.0 currently under development, and the challenges of using the LFLFC as a research tool.


When: Thursday, June 1, 2017 from 12 - 1pm
Where: 200C Warren Hall, 2195 Hearst St (see building access instructions on the parent page)
What: Developing Pedagogical and Research Tools for a Film Delivery Platform
Presenting: Mark Kaiser, Berkeley Language Center

Please review prior to the meeting:


Presenting: Mark Kaiser, Berkeley Language Center (BLC)


Aaron Culich, Research IT
Barbara Gilson, SAIT
Chris Hoffman, Research IT
Christian --, University of Munich (guest of Mark Kaiser, visiting UCB)
David Greenbaum, Research IT
Deb McCaffrey, BRC
Jason Christopher, Research IT
John Lowe, Research IT & Linguistics
Larry Conrad, CIO
Maurice Manning, Research IT
Michael Campos-Quinn, BAM/PFA
Nancy Goldman, BAM/PFA
Quinn Dombrowski, Research IT
Rick Jaffe, Research IT (RDM)
Ruth --, ??
Steve Masover, Research IT


Core problem: Foreign language instructors have long known the value of showing films in foreign languages. Because language learners can be overwhelmed by the unfamiliar language, subtitles have been a common way to soften the 'forced immersion', but subtitles are not the ideal way to 'mediate' between the film and the language learner.

17,000 clips in 32 languages in BLC's LFLFC. Graduate students catalog the clips, which takes about 40 minutes per clip.

Many partners, including universities (Princeton, U Munich, etc.) and high schools. Licensing gets complicated. Some films can't be put up on the internet; others can be, but can't be made available to other campuses.

(See slides - coming soon)

2009: DMCA prohibition on circumvention of copy protection waived for all university departments

2014: collaboration with MRC began. This caused some wrinkles, in that the database was developed with a schema oriented to foreign language instruction (the UI requires, as a first action, that users choose a language). About 3,000 of the MRC's 60,000 films have been cataloged in the BLC's catalog to date, with about 100 added per month.
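For a rough sense of scale, the figures above can be turned into a back-of-the-envelope estimate. This is only a sketch: it assumes the rate stays constant at about 100 films per month and that the goal is the entire MRC collection, neither of which was stated in the talk. The function and constant names are illustrative.

```python
# Figures quoted in the notes; the constant rate is an assumption.
MRC_TOTAL_FILMS = 60_000
CATALOGED_SO_FAR = 3_000
FILMS_PER_MONTH = 100


def months_to_finish(total: int, done: int, per_month: int) -> int:
    """Months needed to catalog the remainder at a constant rate, rounded up."""
    remaining = total - done
    return -(-remaining // per_month)  # ceiling division


months = months_to_finish(MRC_TOTAL_FILMS, CATALOGED_SO_FAR, FILMS_PER_MONTH)
print(f"{months} months, or about {months / 12:.1f} years")
```

At the quoted rate the remaining 57,000 films would take 570 months, which suggests that full coverage would need either a faster process or a selective approach.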

[Demo of the LFLFC application at blcvideoclips@b.e (login required)]

Annotation capability, annotations shared

Challenge: what research tools ought to be built into LFLFC v2.0 (release in ~2018)? Example: how to surface features of spoken English that signal a remark is made sarcastically, and how to examine whether native speakers and ESL speakers perceive these signals or miss them.



Patrick: Language recognition software?
Mark: Interested to hear from technical folks about this.
Quinn: Pop-up archive (for English) -- worth looking at
Patrick: Won't be perfect, but could make transcription efforts much more efficient, transcribers more productive

Michael: tagging system, controlled vocabularies?
Mark: Guidelines. We don't tag common words that are learned in the first week of instruction. Many differences between languages: personal pronouns in English (no meaning) vs Italian (loaded with meaning). Have a list of tags useful to foreign language instruction.

Patrick: faceted search?
Quinn: query refinement?
Mark: not in current version, slated for v2.0

Patrick: Subtitles in addition to tags?
Mark: We offer what's included in the DVD. Can make them optional if the digital copy we accession permits this.
Patrick: Wonder whether subtitle file could be used to help tagging or topic modeling of these clips and films.
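One way Patrick's suggestion could be explored is sketched below. It assumes the subtitles are available as text in SubRip (.srt) form, which the notes do not confirm, and it simply counts word frequencies as candidate tags for a human cataloger to review. `SKIP_WORDS`, `subtitle_text`, and `candidate_tags` are hypothetical names, and the skip list is only a stand-in for the "first week of instruction" vocabulary Mark mentioned.

```python
import re
from collections import Counter

# Hypothetical stand-in for very common words that would not be tagged.
SKIP_WORDS = {"the", "a", "i", "you", "is", "to", "and", "it"}


def subtitle_text(srt: str) -> list[str]:
    """Return only the dialogue lines from SubRip-formatted text."""
    lines = []
    for raw in srt.splitlines():
        line = raw.strip()
        # Skip blank lines, cue numbers, and timing lines ("... --> ...").
        # Note: a dialogue line consisting only of digits is skipped too.
        if not line or line.isdigit() or "-->" in line:
            continue
        lines.append(line)
    return lines


def candidate_tags(srt: str, top_n: int = 5) -> list[tuple[str, int]]:
    """Most frequent words in the dialogue, excluding SKIP_WORDS."""
    words = []
    for line in subtitle_text(srt):
        # Runs of letters only; contractions such as "don't" are split.
        tokens = re.findall(r"[^\W\d_]+", line.lower())
        words.extend(t for t in tokens if t not in SKIP_WORDS)
    return Counter(words).most_common(top_n)


SAMPLE = """1
00:00:01,000 --> 00:00:03,000
Where is the train station?

2
00:00:03,500 --> 00:00:05,000
The station is closed today.
"""

print(candidate_tags(SAMPLE, top_n=3))
```

Raw counts like these would only be suggestions; the tagging guidelines Mark described would still decide what is worth tagging for a given language.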

Patrick: automated segmentation suggestions, as a productivity tool?
Mark: they can get complex...

Patrick: data storage?
Mark: Don't know the technical details; RAID to replicate, plus tape backups. 5 TB. Every film is ~1 GB, at 100 films/month; this can be multiplied by different subtitle-language versions, production of clips in addition to the whole film, etc. The server is in a basement room in Dwinelle. Have looked into the campus data center, but cost is a factor; also worry about uptime, as the data center does go down.
Chris: The RDM team is happy to discuss these issues: researchdata@b.e
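The storage figures Mark quoted can be turned into a rough growth projection. This is a sketch under stated assumptions: each film is taken as roughly 1 GB, the 5 TB figure is treated as current usage (the notes do not say whether it is usage or capacity), and `multiplier` is a hypothetical factor standing in for extra subtitle versions and clips.

```python
FILM_SIZE_GB = 1.0      # assumed: about 1 GB per film
FILMS_PER_MONTH = 100   # rate quoted in the notes
CURRENT_TB = 5.0        # assumed to be current usage, not capacity


def projected_storage_tb(months: int, multiplier: float = 1.0) -> float:
    """Estimated total storage in decimal TB after the given number of months.

    `multiplier` is a hypothetical factor for additional subtitle-language
    versions and clips produced alongside each whole film.
    """
    added_gb = FILM_SIZE_GB * FILMS_PER_MONTH * months * multiplier
    return CURRENT_TB + added_gb / 1000


for factor in (1.0, 2.0, 3.0):
    total = projected_storage_tb(12, factor)
    print(f"multiplier {factor:.0f}: about {total:.1f} TB after one year")
```

With no multiplier, new films alone add about 1.2 TB per year; the real figure depends on how many versions and clips are kept per film.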

David: Financial model?
Mark: No charge to other institutions. Issues with Fair Use if we begin to charge.

