
Data and the City: The Promise and Perils of Urban Informatics
By: Constantine E. Kontokosta, Ph.D., PE
July 07, 2015
The marketing rhetoric around Smart Cities is replete with unfulfilled promises, and the persistent use (and misuse) of the term "big data" has generated confusion and distrust around potential applications. The reality remains, however, that disruptive shifts in ubiquitous data collection (including mobile devices, GPS, social media, and synoptic video) and in the ability to store, manage, and analyze massive datasets require the next generation of urban policy and planning practitioners to have new capabilities that respond to these innovations. For students in the emerging fields of Urban Informatics and Civic Analytics, core competencies cross the traditional boundaries of computer and data science, public policy and urban planning, and business and technology management. This results in requirements for both technical and non-technical (or non-computing) skills, as well as breadth and depth across informatics disciplines and domain applications. The technical skills include knowledge of programming (Python, R, etc.), data mining and management (Hadoop, MapReduce), applied mathematics and statistics, machine learning, and visualization (CartoDB, D3).
The challenge of training the next generation of city managers and planners to extract usable insight from data extends far beyond programming skills. Students must be knowledgeable about a wide range of urban data, both structured and unstructured, and be aware of the biases inherent in its definition and collection. They must be able to manage and integrate large, disparate datasets and use a range of analytical techniques to interpret and visualize outcomes in ways that can be communicated effectively to non-technical audiences and drive decision-making and policy design. This next generation must also understand city governance, structure, and history sufficiently to identify and assess problems, collect and organize appropriate data, utilize suitable analytical approaches, and ultimately produce results that recognize the constraints faced by city agencies and policymakers. This is not an easy task, and it requires an understanding of urban social and political dynamics in addition to a significant appreciation of data governance, privacy, and ethics.

Rendering provided by the Related Companies: Hudson Yards teaming up with CUSP — evidence of the potential for the emerging field of urban informatics
The Center for Urban Science and Progress (CUSP) at New York University was created through New York City's Applied Sciences Initiative to bring the tools of data science and big data analytics to bear on the most pressing urban challenges facing New York City and cities around the globe. CUSP is unique as an academic institution with respect to both its mission-focus and its extensive partnership of public, private, and academic organizations. Most notably, CUSP has partnered with the City of New York in order to work side-by-side with city agencies on identifying and defining operational and policy challenges, assessing data needs and availability, and deploying computational techniques to support the discovery and implementation of potential solutions to these challenges.
At the same time, the use of data to motivate and drive change in the urban context is not without significant obstacles and risks. Much has been said already about the historical tendency (however brief the history) of big data analytics to focus on prediction and correlation, rather than causation — an omission that becomes more pronounced when advancing policy instruments that address social issues in cities. For instance, sophisticated algorithms can be developed to assess and predict community lending risk, but these will inevitably, if not explicitly, capture the historic relationship between neighborhood poverty and race. The result may be a tool that reinforces patterns of segregation because of a narrow focus on correlations, rather than one that enhances our understanding of neighborhood trajectories and improves conditions through greater access to capital.
Privacy and the use of personally identifiable information emerge as another pressing challenge. While much can be learned from mobility tracking of smart devices, the potential infringement on personal privacy cannot and should not be ignored. Instead, a balance must be found between the sensitivity of data collected and its use. Cities must develop a set of use cases that clearly demonstrate the benefits of more widespread data collection, while at the same time ensuring that the least sensitive data needed are used to provide that benefit. This is an ongoing debate on legal, technical, and moral grounds, and a discussion that needs to be handled in an open, transparent dialogue among city representatives, technology and data providers, data analysts, and the community.
Despite these potential pitfalls, data-driven decision-making has already proven valuable in cities. Examples now abound, from detecting illegal residential occupancies in New York City, to dynamic road-user charges in Singapore, to measuring neighborhood health in Chicago, to using GPS-enabled asthma inhalers to identify areas of poor air quality in Louisville. So despite the limitations (and potential negative outcomes) of ignoring causality in the policy and planning context, correlation and prediction can be quite useful tools in the operational environment of cities to support decision-making and resource allocation.
But the opportunity here is not simply a matter of analyzing the data we have. We now know that new methods of data collection are needed, coupled with significant shifts in city governance that support data-driven and evidence-based operational, policy, and planning decisions and public engagement. This will require investment in digital infrastructure to support sensor deployment and data access, novel pedagogy to train future practitioners in applied data analysis, reforms to the organizational structure of cities to accommodate new realities around data-sharing and inter-agency collaboration, and a new approach to how we think about cities and innovation. This final point is in some ways the most critical: we need to think about cities as experimental environments, where we are able to learn through observation, test new ideas and innovations, and ultimately understand their implications across social, economic, and environmental concerns. The Quantified Community initiative is one example of how this concept can be implemented.
This experimental mindset represents perhaps the most profound implication of big data for cities: policymakers can now gain an understanding of city life (of ebbs and flows of consumption, of commuting patterns, of vulnerability and need) at a spatial and temporal granularity unprecedented in history. Building on the observational lens of the city championed by Jane Jacobs in The Death and Life of Great American Cities and by William H. "Holly" Whyte in The Social Life of Small Urban Spaces, new technology and computational capabilities extend what is knowable about city life far beyond what early urban researchers could conceive.
This unprecedented reach creates the opportunity to test policy interventions and evaluate their impacts across multiple dimensions of critical urban systems, such as human behavior and ecology, while meaningfully engaging the public to both facilitate this understanding and interpret the results. We can then compress public decision-making processes in both time and space to more rapidly address the needs of residents and better evaluate urban policy and planning responses to the significant, and growing, challenges facing cities and their social implications for urban life.
Constantine E. Kontokosta is the NYU Center for Urban Science and Progress (CUSP) Deputy Director for Academics, an Assistant Professor of Urban Informatics at CUSP and the School of Engineering, and Head of the Quantified Community Research Lab. He is also the Founding Director of the NYU Center for the Sustainable Built Environment. He holds a Ph.D. and M.S. in Urban Planning, specializing in urban economics, from Columbia University, an M.S. in Real Estate Finance from New York University, and a B.S.E. in Civil Engineering Systems from the University of Pennsylvania. He is a licensed Professional Engineer, a member of the American Institute of Certified Planners, a USGBC LEED Accredited Professional, and has been elected a Fellow of the RICS. He has been named a Fulbright Senior Specialist in the field of Urban Planning and has won awards for Teaching Excellence and Outstanding Service at NYU.