In this talk, I will present an overview of my research over the past decade on large-scale optimization for machine learning and on collective behavior in networked, natural, engineering, and social systems. These collective phenomena include social aggregation as well as the emergence of consensus, swarming, and synchronization in complex networks of interacting dynamic systems such as mobile robots and sensors. A common underlying theme in this line of study is to understand how a desired global behavior can emerge from purely local interactions. The extension of these ideas to social systems has led to the development of a new theory of collective decision making among people and organizations. Examples include participation decisions in uprisings, social cascades, investment decisions in public goods, and decision making in large organizations. I will investigate distributed strategies for information aggregation, social learning, and detection problems in networked systems where heterogeneous agents with different observations (of varying quality and precision) coordinate to learn a true state using a stream of private observations and interaction with neighboring agents. Examples include finding aggregate statistics or detecting faults and failure modes in spatially distributed wireless sensor networks and, in social networks, deciding the suitability of a political candidate, judging the quality of a product, or forming opinions on social issues of the day. I will end the talk with a new vision for research and graduate education at the interface of information and decision systems, data science, and social sciences.