James Salsman is a statistician and software engineer with over 30 years of experience in speech, signal processing, C, Python, Perl, JavaScript, R, SQL, Tcl/Tk, WebRTC, and related technologies. He is currently working on speech recognition for pronunciation evaluation, helping people learn to speak and read well. Salsman's contributions to open source software include substantial improvements to the efficiency of the phase vocoder algorithm, upgrades to Tcl and Android, and work to patch and extend MediaWiki. He studied computer science and statistics at Carnegie Mellon University.

Other interests include:

  • Hack the Future (volunteer mentor)
  • Google Summer of Code (volunteer mentor)

Selected highlights

17zuoye.com, 2017: 30 million K–6 English as a Second Language students are customers of this independent commercial homework app, which teachers in every province of China assign after selecting from competing products.

EF Education First, EF Learning Labs, Shanghai, China, 2013–2014: Improved automatic speech recognition (ASR) systems providing pronunciation assessment for English language learning by diagnosing Adobe Flash-based microphone upload channel faults, immediately reversing a 30% accuracy drop that had occurred prior to my arrival. Architected, validated, and implemented a further pronunciation assessment accuracy improvement using Sensory Fluentsoft ASR, with phoneme duration and acoustic scores normalized by establishing a leaderboard of exemplar pronunciations from student uploads, achieving a 24% increase in the scores’ agreement with a panel of human judges. Prototyped auditory feedback for pronunciation exercises, designed ASR QA systems, and made additional word and phrase score improvements on cross-platform mobile and desktop ASR implementations. Made several other contributions to processes, internal technical documentation, and online learning functions. Used C, JavaScript, sh, C#, and Objective-C on Android, iOS, Linux servers, Windows ASP.NET servers and desktop, and OS X.

Selected publications

Yuan Gao, Brij Mohan Lal Srivastava, James Salsman (2017) "Spoken English Intelligibility Remediation with PocketSphinx Alignment and Feature Extraction Improves Substantially over the State of the Art." In press: https://arxiv.org/abs/1709.01713

J. Salsman (July 2014) “Development challenges in automatic speech recognition for computer assisted pronunciation teaching and language learning” in Proceedings of the Research Challenges in Computer Aided Language Learning Conference (CALL 2014) Antwerp, Belgium: talknicer.com/Salsman-CALL-2014.pdf

S. Ronanki, J. Salsman, and L. Bo (December 2012) “Automatic Pronunciation Evaluation and Mispronunciation Detection using CMU Sphinx.” in Proceedings of the Workshop on Speech and Language Processing Tools in Education, pp. 61–68. 24th International Conference on Computational Linguistics (COLING 2012) Mumbai, India: www.aclweb.org/anthology/W12-5808

K. Roast and J. Salsman (August 2011) “K3D JavaScript Canvas Library.” Software documentation: en.wikibooks.org/wiki/K3D_JavaScript_Canvas_Library

J. Salsman (May 2010) “Asynchronous Microphone Upload – for Pronunciation Assessment, High-Quality, Low-Bandwidth Voice, Speech Transcription, Translation, and Speaker Identification and Verification.” in the Proceedings of the World Wide Web Consortium Workshop on Conversational Applications (W3C CONVAPPS) June 18–19, 2010, Somerset, New Jersey: www.w3.org/2010/02/convapps/Papers/asynchMicUpload.pdf

J. Salsman (October 2010) “Teaching computers to teach people to read and speak.” One Laptop Per Child San Francisco Bay Area Community Summit (OLPC-SF 2010) presentation. San Francisco, California: talknicer.com/olpcsf.pdf

J. Salsman (2005) “ReadSay PROnounce English System.” Self-published commercial software and instructional modules: talknicer.com/pronounce

J. Salsman (August 2004) “Getting Sorted Indices out of lsort.” Tcl Improvement Proposal (TCL TIP) #217. Tcl Developer Xchange: www.tcl.tk/cgi-bin/tct/tip/217.html

J. P. Salsman (July 1999) “Form-based Device Input and Upload in HTML.” World Wide Web Consortium Note submission from Cisco Systems, San Jose, California: www.w3.org/TR/device-upload

J. Salsman and H. Alvestrand (May 1999) “The Audio/L16 MIME content type.” Internet Engineering Task Force Request for Comments (IETF RFC 2586) www.ietf.org/rfc/rfc2586.txt

Interested in

I study en:Wikipedia:Short popular vital articles and apply what I have learned to improve the encyclopedia and execute the Foundation's Mission in ways that other people may have missed. I took en:Plug-in hybrid to featured article status and en:Birth control and en:Feminist economics to good article status, and wrote en:Carbon-neutral fuel, covering approaches like Google/Alphabet Project Foghorn and natural gas power plant flue exhaust recycling when they are powered by discounted nighttime wind. I also study the relationship between inequality and health, the Making Work Pay tax credit, free community college, universal health care, and BT mosquito abatement with both sinking and floating spores. I don't like neonicotinoid pesticides, monopolies, or gerrymandering. I like sliding scale compulsory royalties, black hole dark matter, supercooled vitrification, colonizing Titan, and pronunciation assessment for language learning. I started editing in 2005.

To that end, please endorse or critique my Grants:Project/Intelligibility transcriptions grant proposal, or both.

The quality of the encyclopedia and the execution of the Foundation's Mission are of paramount importance to me.


James Salsman (talk / mailto:jim@talknicer.com)

User language
en-N This user has a native understanding of English.