Available online at www.sciencedirect.com

ScienceDirect Procedia Computer Science 98 (2016) 169 – 173

The 7th International Conference on Emerging Ubiquitous Systems and Pervasive Networks (EUSPN 2016)

Ubiquitous Citizen Programming

Dave Mason

Ryerson University, Toronto M5B 2K3, Canada

Abstract

Modern society is increasingly mediated by computers. The quantity and diversity of data generated daily are growing at an astounding rate. While the number of people with programming ability is also growing, the percentage of the population with such capability remains in the single digits. This means that we are witnessing a growing gap between need and capacity. Some of the need can be met by packages and apps that provide canned or predefined analysis, but these restrict analyses to those anticipated by the creators of those packages and apps. It is essential that in a data society nearly all citizens have a capacity to program. Therefore, it is critical that we find a “programmatic” analysis capability that is available and accessible to most citizens.

© 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the Conference Program Chairs.

Keywords: Citizen Programming, Ubiquitous Computing, Data Society

E-mail address: [email protected]

1877-0509 © 2016 Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the Program Chairs doi:10.1016/j.procs.2016.09.027

1. Introduction

You’re driving home one night and you hit a pothole. Glancing in the rearview mirror, you notice that the streetlight there is burnt out. Then you notice another light out and hit another pothole. You wonder if there’s some kind of correlation.

Today, you might contact a city newspaper and try to convince them that it is an interesting story, so that they could hire a programmer to access the data and produce some kind of report. More likely, you flake out in front of the TV and forget about it.

If you had an environment that allowed you to access a variety of data resources and easily manipulate that data, things might be different. You poke around on the WWW and find that the power company has a web page with a list of all the burnt-out lights, so you mark that. Then you find a public database from the city that contains the work orders to repair broken pavement, and you mark that. You write a little program to correlate the data from the power company and the city and display the information on a map. Sure enough, you see a correlation, but you also notice that the worst situation appears to be in the poorer parts of town, so you pull in a spreadsheet from the census that contains income-by-electoral-district data and, sure enough, that’s correlated too. In 15 or 20 minutes you have transformed some raw, public data into real information that you can share with a news outlet or take to a politician to ask for a fix!

While there are a few professional programmers who could do this analysis today, it would take them significantly more time than outlined above, so they would be much less likely to do it. More importantly, they might not have noticed the puzzle that motivated you to explore the data in the first place.

We believe that as more data becomes available to us, and the problems facing us become more challenging, it is critical that most citizens be able to process that data so that the serendipity of observations such as those in our scenario can be capitalized upon. We also believe that citizen oversight of governments and corporations in our increasingly complex and inter-connected world requires a broad-based capability to perform computation.
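To make the scenario concrete, the “little program” might look something like the following minimal sketch in a conventional text-based language. Everything in it is an assumption made for illustration: the record layout, the district names, and the sample values are invented, and a real version would first have to fetch and clean the power company's and the city's data. It counts records per district and computes a Pearson correlation between the two counts.

```python
from collections import Counter
from math import sqrt

# Hypothetical sample records: (district, record id). Real data would come
# from the power company's outage list and the city's pavement work orders.
burnt_out_lights = [("North", "L1"), ("North", "L2"), ("North", "L3"),
                    ("East", "L4"), ("East", "L5"), ("South", "L6")]
pothole_orders = [("North", "W1"), ("North", "W2"), ("North", "W3"),
                  ("North", "W4"), ("East", "W5"), ("East", "W6"),
                  ("South", "W7"), ("West", "W8")]


def count_by_district(records):
    """Count how many records fall in each district."""
    return Counter(district for district, _ in records)


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


lights = count_by_district(burnt_out_lights)
potholes = count_by_district(pothole_orders)

# Use every district seen in either source; a Counter returns 0 for missing keys.
districts = sorted(set(lights) | set(potholes))
light_counts = [lights[d] for d in districts]
pothole_counts = [potholes[d] for d in districts]

r = pearson(light_counts, pothole_counts)
print(f"districts: {districts}")
print(f"correlation between lights out and pothole repairs: {r:.2f}")
```

Even this toy version presumes familiarity with data structures, functions, and a statistics formula, which is the barrier the rest of this paper is concerned with. The map display and the census income data from the scenario are left out here.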

2. Citizen Programming

When we refer to citizen programmers, we mean primarily amateur programmers who are programming not merely for education or entertainment, but are solving problems of their own devising, or analysing data to make decisions. In 2005 it was estimated [1] that in 2012 there would be 90 million end user programmers, including 13 million describing themselves as programmers. With the U.S. Bureau of Labor Statistics describing 3 million people with the job description of programmer or developer, this indicates between 3 and 13 million people using traditional, text-based programming languages.

2.1. History

Over the years there have been many languages and environments that have facilitated a degree of programming for the amateur programmer. These environments have often been met with derision from professional programmers even as they have provided validation, entertainment, and capability to the amateur programmer.

2.1.1. Early Programming

The earliest programmers were, of course, not trained programmers. Rather, they were mathematicians, engineers, or scientists who wanted to use the fast (if primitive) computers to solve their primarily mathematical problems. They were programming in machine code or assembly language, and later FORTRAN, but the computers, and hence the programs that ran on them, were very small.

In 1959, COBOL was created as a first attempt at a portable language that would be accessible to a large cohort of business analysts with limited formal training in programming. It was relatively successful in this goal, with over 1 million people learning to program in it.

In 1964, BASIC became available at Dartmouth College, and versions of it became available on virtually every personal computer produced in the succeeding decades. It became the gateway language for thousands of future programmers [2].

In the late 1970s, Smalltalk was created with the goal of creating a programming utility device for children, but it was largely adopted by professional programmers because of the cost of the systems on which it then ran.

2.1.2. Spreadsheets

Prior to 1979, if a manager wanted to include calculated information derived from the company’s computer systems in a report, they would have to request it from, and negotiate with, the IT department, and it could take weeks! Then VisiCalc was released and everything changed. MBAs and others who could manipulate spreadsheets became king, and what-if analyses and scenarios became a mainstay of corporate decision-making.

Today, classical spreadsheets are becoming less useful as data becomes too large and too dynamic to usefully capture in spreadsheets. And only a small fraction of the population can use anything near the full power of a spreadsheet!


2.2. Today

There are many modern examples of environments that people suggest as suitable for amateur programmers, from proprietary languages such as Apex [3] to the ubiquitous JavaScript [4,5].

3. Ubiquitous Computing

Even as little as a decade ago, the idea that the majority of adults would carry around a computer everywhere, let alone a computer many times more powerful than standard desktop computers of the day, was pure science fiction. Additionally, programming was widely thought of in very particular terms: it was something that large organizations used to do banking transactions or air-traffic control.

Today, with the rise of the teaching and application of Computational Thinking [6], the range of applications for computers and programming is blossoming. While Computational Thinking is being taught broadly across disciplines as a thought tool, it is creating a natural audience of citizens who are disposed to thinking of problems in terms of programming and computer science concepts.

4. Data Society

Our technological world is generating and recording data at a phenomenal rate. Some statistics from 2014 [7] demonstrate this. Every minute:

• Facebook users share nearly 2.5 million pieces of content.
• Twitter users tweet nearly 300,000 times.
• Instagram users post nearly 220,000 new photos.
• Google gets 4,000,000 search requests.
• YouTube users upload 72 hours of new video content.
• Apple users download nearly 50,000 apps.
• Email users send over 200 million messages.
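As a rough check on the scale of these figures, the following sketch multiplies each count by a typical item size and sums the results. The counts are the rounded per-minute values from the list above (the interval used by the cited infographic). The item sizes are purely our illustrative assumptions, not values taken from the source, and changing them will change the total considerably.

```python
# (count per minute, assumed size in bytes) for each item in the list above.
# All sizes are illustrative assumptions, using decimal units (1 KB = 1000 bytes).
ITEMS = {
    "Facebook content":    (2_500_000, 100_000),        # assume ~100 KB per shared item
    "Tweets":              (300_000, 500),              # assume ~0.5 KB per tweet
    "Instagram photos":    (220_000, 1_000_000),        # assume ~1 MB per photo
    "Google searches":     (4_000_000, 50_000),         # assume ~50 KB per results page
    "YouTube video hours": (72, 1_000_000_000),         # assume ~1 GB per hour of video
    "App downloads":       (50_000, 20_000_000),        # assume ~20 MB per app
    "Emails":              (200_000_000, 5_000),        # assume ~5 KB per message
}


def bytes_per_minute(items):
    """Sum count * size over all item types."""
    return sum(count * size for count, size in items.values())


total = bytes_per_minute(ITEMS)
print(f"{total / 1e12:.2f} TB per minute")  # prints "2.74 TB per minute"
```

Under these assumed sizes the total is on the order of a few terabytes per minute, dominated by app downloads and email.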

Using typical sizes for those items, this is a transmission of a couple of terabytes of data per minute, and has probably doubled in the ensuing 2 years. Add to this the data generated by the billions of automobiles, hundreds of thousands of airplanes, sensors (including CCTV) throughout our cities, and the additional data generated by the billions of dollars of commercial transactions taking place on a daily basis, and the total is mind-boggling. While much of this data is proprietary, most world governments have open-data initiatives.

4.1. Data Scientist

The strategic value of this data is so great that the Harvard Business Review calls Data Scientist the sexiest job of the 21st century [8]. While data science is certainly important [9], we believe that leaving the data analysis to highly educated, highly paid Data Scientists sells us all short, and that we should be working to build similar capability and competency among ordinary citizens.

4.2. Siri, Wolfram-Alpha, and Watson

Some would argue that machine learning and “AI”s will address the need that we describe here. However, to a great extent, these tools have answers for questions that some programmer has decided are important (or that have been asked enough times that they have come to a programmer's attention). This does not leave much room for serendipity, or creative judgement, or even just raw curiosity. To explore the data world along those avenues ultimately requires a citizen capability to program.


5. Research Direction

In section 2, we described a range of languages and environments that have been used by amateur programmers through the years. A 2013 study [10] ran experiments on text-based syntaxes and derived the language Quorum, a language oriented to novice programmers. A 2011 survey [11] of end-user programming tools describes many environments that are being used by what we are calling citizen programmers, mostly spreadsheets. However, none of these was developed specifically for the citizen programmer (an amateur programmer who needs to analyse and make sense of diverse data) or for the dynamics and scale of today’s data world. We are trying to create such an environment following a few key principles.

1. Intuitive “syntax”. Our experience with teaching Computer Science for 35 years, and a liberal studies programming course for 3 years, has led to the strong opinion that likely not more than 10-15% of the population will ever “get” traditional text-based programming languages. We are currently running experiments to ascertain which visual syntaxes require the least thinking to navigate, and hence are the most intuitive. Our current front runner is a dataflow environment [12,13], but we are also exploring block languages like Scratch [14] and Blockly [15].

2. Low barrier-to-entry. Many “real” programmers dismiss any non-textual programming environment: they had to work hard to master their language of choice, and they think that that effort is part of the value. But citizen programmers don’t want to “learn to program”; they want to solve problems. Unlike those conventional programmers, if they meet significant obstacles they will give up (and quite possibly define themselves as too stupid). Collecting data must be natural and trivial, and combining and analysing that data must be just as seamless.

3. Unobtrusive guidance. One of the challenges of data programming is finding useful presentations of results so that the maximum information is available with the least effort. To this end we are running experiments to determine the best forms of presentation for different kinds of data. When you choose a graph or a map, the environment will by default give the best representation of the data, while allowing you to quickly and easily see other highly effective representations. Similarly, when you are trying to make sense of data, the environment will suggest ways to correlate the data.

4. Full-stack platform. No team can possibly build all the pieces that will be needed in all applications. In fact, one of our major criticisms of the systems mentioned in section 4.2 is that they can never have all the answers. Therefore it is essential that we build our system in layers so that others can solve problems using the same environment, all the way down. This also supports serendipitous discovery and allows users to learn how others solve problems by looking at their code, while ignoring the detail unless and until they need it. We also want to support the full gamut of possible applications from simple data analysis, to artistic programming, to games, to . . .

6. Conclusions

We have built a prototype of the system, a team of academic researchers is working on the principles mentioned in section 5, and we are building partnerships with industry. Our system may not become “the” system that provides programming capability to the world, but we are convinced that some such system must become available if we are to reach our potential as a species and address the complex problems of our modern world.

References

1. C. Scaffidi, M. Shaw, B. Myers, Estimating the numbers of end users and end user programmers, IEEE, 2005. doi:10.1109/VLHCC.2005.34.
2. H. McCracken, Fifty years of BASIC, the programming language that made computers personal, Time. URL http://time.com/69316/basic/
3. D. Appleman, The salesforce platform: The return of the citizen programmer (Nov. 2014). URL https://www.simple-talk.com/opinion/opinion-pieces/the-salesforce-platform-the-return-of-thecitizen-programmer/
4. D. Yang, 4 reasons to learn JavaScript as your first programming language (May 2014). URL http://www.skilledup.com/articles/4-reasons-learn-javascript-first-programming-language
5. A. Birnir, The first programming language you should learn is (Apr. 2014).
6. J. Wing, Computational thinking, J. Comput. Sci. Coll. 24 (6) (2009) 6–7. URL http://dl.acm.org/citation.cfm?id=1529995.1529997
7. S. Gunelius, The data explosion in 2014 minute by minute infographic (Jul. 2014). URL http://aci.info/2014/07/12/the-data-explosion-in-2014-minute-by-minute-infographic/
8. T. H. Davenport, D. Patil, Data scientist the sexiest job of the 21st century?, Harvard Business Review. URL https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/
9. What is a data scientist? URL https://www-01.ibm.com/software/data/infosphere/data-scientist/
10. A. Stefik, S. Siebert, An empirical investigation into programming language syntax, Trans. Comput. Educ. 13 (4) (2013) 19:1–19:40. doi:10.1145/2534973. URL http://doi.acm.org/10.1145/2534973
11. A. J. Ko, R. Abraham, L. Beckwith, A. Blackwell, M. Burnett, M. Erwig, C. Scaffidi, J. Lawrance, H. Lieberman, B. Myers, M. B. Rosson, G. Rothermel, M. Shaw, S. Wiedenbeck, The state of the art in end-user software engineering, ACM Comput. Surv. 43 (3) (2011) 21:1–21:44. doi:10.1145/1922649.1922658. URL http://doi.acm.org/10.1145/1922649.1922658
12. D. Mason, Flexible structures for end-user programming, in: Proceedings of the 3rd international workshop on Free composition, FREECO ’12, ACM, New York, NY, USA, 2012, pp. 9–11. doi:10.1145/2414716.2414720. URL http://doi.acm.org/10.1145/2414716.2414720
13. D. Mason, Data programming for non-programmers, in: The 4th International Conference on Emerging Ubiquitous Systems and Pervasive Networks, EUSPN 2013, Elsevier Science Publishers B.V. (North-Holland), 2013.
14. J. L. Ford, Scratch Programming for Teens, 1st Edition, Course Technology Press, Boston, MA, United States, 2008.
15. Google, Blockly. URL https://developers.google.com/blockly/
