JSON

Sunday, October 24, 2010

JSON (an acronym for JavaScript Object Notation, pronounced /dʒeɪsɔːn/) is a lightweight, text-based open standard designed for human-readable data interchange. It is derived from the JavaScript programming language's notation for simple data structures and associative arrays, called objects. Despite its relationship to JavaScript, it is language-independent, with parsers available for virtually every programming language.
The JSON format was originally specified by Douglas Crockford, and is described in RFC 4627. The official Internet media type for JSON is application/json. The JSON filename extension is .json.
The JSON format is often used for serializing and transmitting structured data over a network connection. It is primarily used to transmit data between a server and a web application, serving as an alternative to XML.

Data types, syntax and example
JSON's basic types are:
  • Number (integer or real)
  • String (double-quoted Unicode with backslash escaping)
  • Boolean (true or false)
  • Array (an ordered sequence of values, comma-separated and enclosed in square brackets)
  • Object (a collection of key:value pairs, comma-separated and enclosed in curly braces; the key must be a string)
  • null
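As a rough illustration of how these six types surface after parsing, the sketch below uses the standard JSON.parse() function together with a hypothetical helper, jsonTypeOf(), which is not part of any library. The helper is needed because JavaScript's typeof operator reports both arrays and null as "object".

```javascript
// Hypothetical helper: name the JSON type of an already-parsed value.
function jsonTypeOf(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value; // "number", "string", "boolean" or "object"
}

var sample = JSON.parse('[25, "John", true, [1, 2], {"city": "New York"}, null]');
var names = sample.map(jsonTypeOf);
console.log(names.join(", "));
// number, string, boolean, array, object, null
```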
The following example shows the JSON representation of an object that describes a person. The object has string fields for first name and last name, a number field for age, contains an object representing the person's address, and contains a list (an array) of phone number objects.
{
     "firstName": "John",
     "lastName": "Smith",
     "age": 25,
     "address": 
     {
         "streetAddress": "21 2nd Street",
         "city": "New York",
         "state": "NY",
         "postalCode": "10021"
     },
     "phoneNumber": 
     [
         {
           "type": "home",
           "number": "212 555-1234"
         },
         {
           "type": "fax",
           "number": "646 555-4567"
         }
     ]
 }
A strictly bijective equivalent for the above in XML could be:
<Object>
  <Property><Key>firstName</Key>     <String>John</String></Property>
  <Property><Key>lastName</Key>      <String>Smith</String></Property>
  <Property><Key>age</Key>           <Number>25</Number></Property>
  <Property><Key>address</Key>
    <Object>
      <Property><Key>streetAddress</Key> <String>21 2nd Street</String></Property>
      <Property><Key>city</Key>          <String>New York</String></Property>
      <Property><Key>state</Key>         <String>NY</String></Property>
      <Property><Key>postalCode</Key>    <String>10021</String></Property>
    </Object>
  </Property>
  <Property><Key>phoneNumber</Key>
    <Array>
      <Object>
        <Property><Key>type</Key>          <String>home</String></Property>
        <Property><Key>number</Key>        <String>212 555-1234</String></Property>
      </Object>
      <Object>
        <Property><Key>type</Key>          <String>fax</String></Property>
        <Property><Key>number</Key>        <String>646 555-4567</String></Property>
      </Object>
    </Array>
  </Property>
</Object>
However, the following simplified XML using types (entities) is more likely to be used by an XML practitioner:
<Person firstName="John" lastName="Smith" age="25">
  <Address streetAddress="21 2nd Street" city="New York" state="NY" postalCode="10021" />
  <PhoneNumbers>
    <PhoneNumber type="home" number="212 555-1234"/>
    <PhoneNumber type="fax"  number="646 555-4567"/>
  </PhoneNumbers>
</Person>

Note that while both the JSON and XML forms can carry the same data, the (second) XML example also conveys semantic content/meaning.
Since JSON is a subset of JavaScript it is possible (but not recommended) to parse the JSON text into an object by invoking JavaScript's eval() function. For example, if the above JSON data is contained within a JavaScript string variable contact, one could use it to create the JavaScript object p like so:
        
var p = eval("(" + contact + ")");
The contact variable must be wrapped in parentheses to avoid an ambiguity in JavaScript's syntax.
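The ambiguity can be seen directly: at the start of a statement, a leading { is read as the beginning of a block rather than an object literal. The short sketch below is for illustration only and calls eval() on fixed, trusted strings.

```javascript
// Without parentheses, "{}" is parsed as an empty block statement.
var asBlock = eval("{}");
console.log(typeof asBlock); // "undefined"

// With parentheses, the same text is parsed as an expression (an object literal).
var asObject = eval("({})");
console.log(typeof asObject); // "object"

// A non-empty object without parentheses is not even valid as a block.
var failed = false;
try {
  eval('{"firstName": "John"}');
} catch (e) {
  failed = e instanceof SyntaxError;
}
console.log(failed); // true
```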
The recommended way, however, is to use a JSON parser. Unless a client absolutely trusts the source of the text, or must parse and accept text which is not strictly JSON-compliant, one should avoid eval(). A correctly implemented JSON parser will accept only valid JSON, preventing potentially malicious code from running.
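A minimal sketch of the recommended approach, assuming an environment that provides JSON.parse() (natively or through a library such as json2.js). The contact string here is a shortened stand-in for the example above, and the hostile string is invented for illustration.

```javascript
var contact = '{"firstName": "John", "lastName": "Smith", "age": 25}';
var p = JSON.parse(contact);
console.log(p.firstName + " " + p.lastName); // John Smith

// Text that is JavaScript but not JSON is rejected instead of being executed.
var hostile = '(function () { return "code ran"; })()';
var rejected = false;
try {
  JSON.parse(hostile);
} catch (e) {
  rejected = e instanceof SyntaxError;
}
console.log(rejected); // true
```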
Modern browsers, such as Firefox 3.5 and Internet Explorer 8, include native support for parsing JSON. Native browser support is more efficient and secure than eval(), and native JSON support has been standardized in the Fifth Edition of ECMAScript.
 JSON schema
There are several ways to verify the structure and data types inside a JSON object, much like an XML schema.
JSON Schema is a specification for a JSON-based format for defining the structure of JSON data. JSON Schema provides a contract for what JSON data is required for a given application and how it can be modified, much like what XML Schema provides for XML. JSON Schema is intended to provide validation, documentation, and interaction control of JSON data. JSON Schema is based on the concepts from XML Schema, RelaxNG, and Kwalify, but is intended to be JSON-based, so that JSON data in the form of a schema can be used to validate JSON data, the same serialization/deserialization tools can be used for the schema and data, and it can be self-descriptive.
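To make the idea concrete, the sketch below checks a value against a very small, hand-written subset of schema keywords (type, properties, required). It is not an implementation of the JSON Schema specification: the keyword usage is an assumption modeled on later drafts of JSON Schema, and the function names typeName() and validate() are hypothetical.

```javascript
// Toy validator for illustration only; supports "type", "properties", "required".
function typeName(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value;
}

function validate(schema, value, path) {
  path = path || "$";
  var errors = [];
  if (schema.type && typeName(value) !== schema.type) {
    errors.push(path + ": expected " + schema.type + ", got " + typeName(value));
    return errors; // no point descending into a value of the wrong type
  }
  if (schema.type === "object") {
    (schema.required || []).forEach(function (key) {
      if (!Object.prototype.hasOwnProperty.call(value, key)) {
        errors.push(path + ": missing property " + key);
      }
    });
    Object.keys(schema.properties || {}).forEach(function (key) {
      if (Object.prototype.hasOwnProperty.call(value, key)) {
        errors = errors.concat(
          validate(schema.properties[key], value[key], path + "." + key)
        );
      }
    });
  }
  return errors;
}

// The schema is itself plain JSON-compatible data.
var personSchema = {
  type: "object",
  required: ["firstName", "age"],
  properties: {
    firstName: { type: "string" },
    age: { type: "number" }
  }
};

console.log(validate(personSchema, { firstName: "John", age: 25 }).length); // 0
console.log(validate(personSchema, { firstName: "John", age: "25" })[0]);
// $.age: expected number, got string
```

Because the schema is ordinary JSON-compatible data, it could be stored, transmitted, and parsed with the same tools as the documents it describes, which is the property the specification aims for.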

 Using JSON in Ajax

The following JavaScript code shows how the client can use an XMLHttpRequest to request an object in JSON format from the server. (The server-side programming is omitted; it has to be set up to respond to requests at url with a JSON-formatted string.)
var my_JSON_object = {}; 
var http_request = new XMLHttpRequest();
http_request.open( "GET", url, true );
http_request.onreadystatechange = function () {
  if (http_request.readyState == 4 && http_request.status == 200){
       my_JSON_object = JSON.parse( http_request.responseText );
  }
};
http_request.send(null);
Note that the use of XMLHttpRequest in this example is not cross-browser compatible; syntactic variations are available for Internet Explorer, Opera, Safari, and Mozilla-based browsers. The usefulness of XMLHttpRequest is limited by the same origin policy: the URL replying to the request must reside within the same DNS domain as the server that hosts the page containing the request. Alternatively, the JSONP approach incorporates the use of an encoded callback function passed between the client and server to allow the client to load JSON-encoded data from third-party domains and to notify the caller function upon completion, although this imposes some security risks and additional requirements upon the server.
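The onreadystatechange handler in the XMLHttpRequest example assumes that a 200 response always carries valid JSON. A slightly more defensive variant is sketched below; parseJsonResponse() is a hypothetical helper name, and it takes the status and body as plain arguments so that it can be exercised without a browser.

```javascript
// Hypothetical helper: returns { ok: true, data } or { ok: false, error }.
function parseJsonResponse(status, responseText) {
  if (status !== 200) {
    return { ok: false, error: "HTTP status " + status };
  }
  try {
    return { ok: true, data: JSON.parse(responseText) };
  } catch (e) {
    return { ok: false, error: "invalid JSON: " + e.message };
  }
}

// Inside onreadystatechange this would be called as
//   parseJsonResponse(http_request.status, http_request.responseText)
var good = parseJsonResponse(200, '{"Name": "Cheeso", "Rank": 7}');
var bad = parseJsonResponse(200, "<html>error page</html>");
console.log(good.ok, good.data.Rank); // true 7
console.log(bad.ok);                  // false
```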
Browsers can also use <iframe> elements to asynchronously request JSON data in a cross-browser fashion, or use simple <form action="url_to_cgi_script" target="name_of_hidden_iframe"> submissions. These approaches were prevalent prior to the advent of widespread support for XMLHttpRequest.
Dynamic <script> tags can also be used to transport JSON data. With this technique it is possible to get around the same origin policy but it is insecure. JSONRequest has been proposed as a safer alternative.

 Security issues

Although JSON is intended as a data serialization format, its design as a subset of the JavaScript programming language poses several security concerns. These concerns center on the use of a JavaScript interpreter to dynamically execute JSON text as JavaScript, thus exposing a program to errant or malicious script contained therein—often a chief concern when dealing with data retrieved from the internet. While not the only way to process JSON, it is an easy and popular technique, stemming from JSON's compatibility with JavaScript's eval() function, and illustrated by the following code examples.

 JavaScript eval()

Because all JSON-formatted text is also syntactically legal JavaScript code, an easy way for a JavaScript program to parse JSON-formatted data is to use the built-in JavaScript eval() function, which was designed to evaluate JavaScript expressions. Rather than using a JSON-specific parser, the JavaScript interpreter itself is used to execute the JSON data to produce native JavaScript objects.
Unless precautions are taken to validate the data first, the eval technique is subject to security vulnerabilities if the data and the entire JavaScript environment are not within the control of a single trusted source. If the data is itself not trusted, for example, it may be subject to malicious JavaScript code injection attacks. Such breaches of trust may also create vulnerabilities for data theft, authentication forgery, and other potential misuse of data and resources. Regular expressions can be used to validate the data prior to invoking eval. For example, the RFC that defines JSON (RFC 4627) suggests using the following code to validate JSON before eval'ing it (the variable 'text' is the input JSON):
  
var my_JSON_object = !(/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(
       text.replace(/"(\\.|[^"\\])*"/g, ''))) &&
   eval('(' + text + ')');
A new function, JSON.parse(), was introduced as a safer alternative to eval, as it is specifically intended to process JSON data and not JavaScript. It was originally to be included in the Fourth Edition of the ECMAScript standard; it is available as a JavaScript library at http://www.JSON.org/json2.js and is part of the Fifth Edition of ECMAScript.
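The RFC 4627 check can be wrapped in a small function to see what it accepts and rejects. looksLikeJson() is a hypothetical name; the regular expressions are the ones quoted above. Note that the check only screens out characters that could form executable code; it does not prove that the text is well-formed JSON.

```javascript
// Screening step from RFC 4627: strip string literals, then look for any
// character that cannot appear in JSON outside a string.
function looksLikeJson(text) {
  var withoutStrings = text.replace(/"(\\.|[^"\\])*"/g, "");
  return !/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(withoutStrings);
}

console.log(looksLikeJson('{"age": 25, "tags": ["a", "b"]}')); // true
console.log(looksLikeJson('alert("hi")'));                      // false: "(" is not allowed
console.log(looksLikeJson('[1, 2'));                            // true, although it is malformed
```

The last case is why the RFC's snippet still passes the text to eval() afterwards: the regular expression filters dangerous input, while the actual parsing (and the syntax error for malformed text) is left to the interpreter.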

Native JSON

Recent web browsers either have or are working on native JSON encoding and decoding. This removes the eval() security problem above and is also faster, because a native JSON parser does not need to handle functions. Native JSON is generally faster than the JavaScript libraries commonly used before. As of June 2009, several browsers had or had announced native JSON support, and at least 5 popular JavaScript libraries had committed to using native JSON if available.
Comparison with other formats

 XML

XML is often used to describe structured data and to serialize objects. Various XML-based protocols exist to represent the same kind of data structures as JSON for the same kind of data interchange purposes.
When data is encoded in XML, the result is typically larger in size than an equivalent encoding in JSON, mainly because of XML's closing tags. For example, ignoring irrelevant white space, the JSON encoding above for the "John Smith" data consumes 240 characters whereas the XML encoding consumes 313 characters. In this particular example, the XML encoding requires just over 30% more characters.
However, there are alternative ways to encode the same information. For example, XML allows the encoding of information in attributes, which are enclosed in quotes and thus have no end tags. Using attributes, an alternative XML encoding of the "John Smith" information is as follows:
<Person
  firstName='John'
  lastName='Smith'
  age='25'>
  <Address
    streetAddress='21 2nd Street'
    city='New York'
    state='NY'
    postalCode='10021'
  />
  <PhoneNumbers
    home='212 555-1234'
    fax='646 555-4567'
  />
</Person>
Not counting the irrelevant white-space, and ignoring the required XML header, this XML encoding only requires 200 characters, which may be compared with 215 characters for the equivalent JSON encoding:
{ 
  "Person": 
  {
     "firstName": "John",
     "lastName": "Smith",
     "age": 25,
     "Address": 
     {
        "streetAddress":"21 2nd Street",
        "city":"New York",
        "state":"NY",
        "postalCode":"10021"
     },
     "PhoneNumbers": 
     {
        "home":"212 555-1234",
        "fax":"646 555-4567"
     }
  }
}
The XML encoding may therefore be shorter than the equivalent JSON encoding, notably if the XML document uses attributes rather than elements. The main reason the JSON is still longer here is that all identifiers (object member names, represented as attribute names or element names in XML) must be written as strings between double quotes. JSON has no identifier type and offers no shortcut for empty strings. If every string that contains neither a separator reserved by the JSON syntax nor compressible whitespace could be written without double quotes, 38 characters would be saved here (including 16 for string values that need no delimiting): without these quotes, the JSON encoding would need only 181 characters.
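The 215-character figure for the JSON encoding can be checked by serializing the object without optional whitespace. The sketch assumes JSON.stringify() is available; called with a single argument, it emits no whitespace between tokens.

```javascript
var doc = {
  Person: {
    firstName: "John",
    lastName: "Smith",
    age: 25,
    Address: {
      streetAddress: "21 2nd Street",
      city: "New York",
      state: "NY",
      postalCode: "10021"
    },
    PhoneNumbers: {
      home: "212 555-1234",
      fax: "646 555-4567"
    }
  }
};

// Compact serialization: only the whitespace inside string values remains.
var compact = JSON.stringify(doc);
console.log(compact.length); // 215
```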
In general, however, XML syntax is much longer than JSON. XML attributes are more restricted than text elements in the set of characters they accept, cannot represent repeated values of the same type with unlimited occurrences or multiple occurrences of objects of the same type, and cannot represent ordered sequences with separate attributes. XML must instead use separate elements, and their recursive embedding requires duplicating the element name in the closing tag of every non-empty element.
Each element type has also to be represented in a default property of each JSON object (because JSON objects are generic, like in JavaScript): here this is realized by assigning element names to the JSON object property with index "0", and numbering explicitly all elements in the content of an element.
JSON also does not offer a way for unifying objects and ordered (number-indexed) arrays, by automatically numbering object properties (like in PHP associative array initializers) to assign them default indexes (despite the fact that Javascript handles them the same way in expressions), when the XML syntax offers a builtin ordering of elements in the content of an element. Instead it just allows declaring separate arrays, that must be assigned to an explicitly named property of an object.
Beyond size, XML lacks an explicit mechanism for representing large binary data types such as image data (although binary data can be serialized in either case by applying a general-purpose binary-to-text encoding scheme such as one of the Base-64 variants). JSON can represent them using arrays of numbers (representing bytes, or larger integer units of up to 53 bits of magnitude plus a sign), because ECMAScript numbers are IEEE-754 64-bit doubles (as specified in the ECMAScript standard).
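As a sketch of the two options mentioned above, the same bytes can be carried either as a JSON array of numbers or as a Base64 string. The example assumes Node.js, whose Buffer class provides the Base64 conversion; in a browser a different encoder would be needed.

```javascript
var bytes = [72, 101, 108, 108, 111]; // the ASCII bytes of "Hello"

// Option 1: an array of numbers, one per byte.
var asArray = JSON.stringify({ data: bytes });
console.log(asArray); // {"data":[72,101,108,108,111]}

// Option 2: a Base64 string (Node.js Buffer assumed).
var asBase64 = JSON.stringify({ data: Buffer.from(bytes).toString("base64") });
console.log(asBase64); // {"data":"SGVsbG8="}

// Decoding the Base64 form recovers the original bytes.
var decoded = Array.from(Buffer.from(JSON.parse(asBase64).data, "base64"));
console.log(decoded.join(",") === bytes.join(",")); // true
```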
JSON also still lacks explicit references (something that XML has via extensions like XLink and XPointer, or that XML has natively via parsed external entities declared in the DTD for including external XML objects, or via unparsed external entities declared in the DTD with annotations for referencing any type of external object, or via attributes with IDREF types for internal references to elements in the same document); it also has no standard path notation comparable to XPath.
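The practical effect of having no reference mechanism can be seen with the standard JSON.stringify() and JSON.parse() functions: a shared sub-object is duplicated on a round trip, and a cyclic structure cannot be serialized at all. The variable names are illustrative.

```javascript
var address = { city: "New York" };
var household = { john: { address: address }, jane: { address: address } };
console.log(household.john.address === household.jane.address); // true: one shared object

// After a round trip through JSON text, the sharing is lost.
var copy = JSON.parse(JSON.stringify(household));
console.log(copy.john.address === copy.jane.address); // false: two separate copies

// A structure that refers back to itself is rejected by JSON.stringify().
var node = { name: "root" };
node.self = node;
var threw = false;
try {
  JSON.stringify(node);
} catch (e) {
  threw = e instanceof TypeError;
}
console.log(threw); // true
```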

 PHP array initializers

PHP array initializers, which are very similar to JSON objects, can represent exactly the same XML as above without having to specify an explicit numbering for ordering elements in the content of any parent element, using the same array object for representing all properties of an element object (the element name at index 0, all its attributes indexed by their name, and all child element objects in its content starting at index 1) and using string objects to represent all text elements, simply as:
array(
  "Person" => array(
    "firstName" => "John",
    "lastName" => "Smith",
    "age" => 25,
    "Address" => array(
      "streetAddress" => "21 2nd Street",
      "city" => "New York",
      "state" => "NY",
      "postalCode" => "10021"
    ),
    "PhoneNumbers" => array(
      "home" => "212 555-1234",
      "fax" => "646 555-4567"
    )
  )
)
One difference with JSON is that PHP arrays can make distinctions between integer values and floating-points, where JSON, JavaScript and ECMAScript only define a single Number type.
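On the JavaScript side, the single Number type means that an integer and the same value written with a fractional part are indistinguishable after parsing, as this small check with the standard JSON functions shows.

```javascript
var a = JSON.parse("25");
var b = JSON.parse("25.0");
console.log(a === b);            // true
console.log(typeof a, typeof b); // number number
console.log(JSON.stringify(b));  // 25
```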
They can also contain references to other objects/arrays and store null references: this requires the use of PHP variables to store references of internal arrays instantiated within the initializer, and dereferencing this variable later in the content of the initializer, in order to create structures (However, it is not possible to create a circular structure or self-referencing array within a single PHP expression like an array initializer, because all members in the array will be instantiated before the array itself is instantiated and assignable into a variable). PHP array initializers can then create any directed acyclic graph from a single root but with possible common branches, when JSON data can only create acyclic tree structures without any common branches.
The boolean constants are represented with integers, but any type can be implicitly converted to a numeric type (strings are parsed and if they can't be converted or are empty, then their numeric value is zero). Then all numeric types or types implicitly convertible to numeric types are convertible to an integer by truncation. All integers are implicitly convertible to booleans (zero is considered false, non-zero is considered true), and other reference value types are considered true if they are non null.

The Basic Idea: Retrieving JSON via Script Tags

It's possible to specify any URL, including a URL that returns JSON, as the src attribute for a <script> tag.
Specifying a URL that returns plain JSON as the src attribute for a script tag would embed a data statement into a browser page. It's just data, and when evaluated within the browser's JavaScript execution context, it has no externally detectable effect.
One way to make that script have an effect is to use it as the argument to a function. invoke( {"Name": "Cheeso", "Rank": 7}) actually does something, if invoke() is a function in JavaScript.
And that is how JSONP works. With JSONP, the browser provides a JavaScript "prefix" to the server in the src URL for the script tag; by convention, the browser provides the prefix as a named query string argument in its request to the server, e.g.,
<script type="text/javascript" 
         src="http://server2.example.com/getjson?jsonp=parseResponse">
 </script>
The server then wraps its JSON response with this prefix, or "padding", before sending it to the browser. When the browser receives the wrapped response from the server it is now a script, rather than simply a data declaration. In this example, what is received is
parseResponse({"Name": "Cheeso", "Rank": 7})
...which can cause a change of state within the browser's execution context, because it invokes a method.
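The server-side wrapping step and its effect in the browser can be sketched as follows. wrapJsonp() is a hypothetical helper, not part of any framework, and the callback-name check is an added precaution rather than something JSONP itself requires. The final lines simulate, outside a browser, what happens when the returned script is executed.

```javascript
// Hypothetical server-side helper: pad a JSON payload with the callback name.
function wrapJsonp(callbackName, data) {
  // Accept only a plain identifier so that arbitrary script cannot be injected.
  if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(callbackName)) {
    throw new Error("invalid callback name");
  }
  return callbackName + "(" + JSON.stringify(data) + ")";
}

var body = wrapJsonp("parseResponse", { Name: "Cheeso", Rank: 7 });
console.log(body); // parseResponse({"Name":"Cheeso","Rank":7})

// Simulation of the browser executing the returned script: the callback
// defined by the page receives the data as an ordinary object.
var received = null;
function parseResponse(data) { received = data; }
new Function("parseResponse", body)(parseResponse);
console.log(received.Rank); // 7
```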

 The Padding

While the padding (prefix) is typically the name of a callback function that is defined within the execution context of the browser, it may also be a variable assignment, an if statement, or any other JavaScript statement prefix.

Script Tag Injection

But to make a JSONP call, you need a script tag. Therefore, for each new JSONP request, the browser must add a new <script> tag—in other words, inject the tag—into the HTML DOM, with the desired value for the src attribute. This element is then evaluated, the src URL is retrieved, and the response JSON is evaluated.
In that way, the use of JSONP can be said to allow browser pages to work around the same origin policy via script tag injection.
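A minimal sketch of script tag injection follows. requestJsonp() is a hypothetical helper; the document object is passed in as a parameter so that the function can also be exercised with a stand-in outside a browser. The jsonp query parameter name follows the example URL above, and the sketch assumes the URL has no query string yet.

```javascript
function requestJsonp(doc, url, callbackName) {
  var script = doc.createElement("script");
  script.type = "text/javascript";
  script.src = url + "?jsonp=" + encodeURIComponent(callbackName);
  // Appending the element is what triggers the browser to fetch and run it.
  doc.getElementsByTagName("head")[0].appendChild(script);
  return script;
}

// In a page:
//   requestJsonp(document, "http://server2.example.com/getjson", "parseResponse");

// Stand-in document for running the sketch outside a browser.
var fakeHead = {
  children: [],
  appendChild: function (el) { this.children.push(el); }
};
var fakeDocument = {
  createElement: function (tag) { return { tagName: tag }; },
  getElementsByTagName: function () { return [fakeHead]; }
};
var tag = requestJsonp(fakeDocument, "http://server2.example.com/getjson", "parseResponse");
console.log(tag.src); // http://server2.example.com/getjson?jsonp=parseResponse
```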

 Basic Security concerns

Because JSONP makes use of script tags, calls are essentially open to the world. For that reason, JSONP may be inappropriate for carrying sensitive data. Including script tags from remote sites allows the remote sites to inject any content into a website. If the remote sites have vulnerabilities that allow JavaScript injection, the original site can also be affected.

 Cross-site request forgery

Naïve deployments of JSONP are subject to cross-site request forgery attacks (CSRF or XSRF).[23] Because the HTML <script> tag does not respect the same origin policy in web browser implementations, a malicious page can request and obtain JSON data belonging to another site. This will allow the JSON-encoded data to be evaluated in the context of the malicious page, possibly divulging passwords or other sensitive data if the user is currently logged into the other site.
This is only a problem if the JSON-encoded data contains sensitive information that should not be disclosed to a third party, and the server depends on the browser's Same Origin Policy to block the delivery of the data in the case of an improper request. There is no problem if the server determines the propriety of the request itself, only putting the data on the wire if the request is proper. Cookies are not by themselves adequate for determining if a request was authorized. Exclusive use of cookies is subject to cross-site request forgery.
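One common way for a server to judge the propriety of a request itself is to require, in addition to the cookie, a secret token that a third-party page cannot know. The sketch below is an illustration under stated assumptions: the shapes of the request and session objects and the csrfToken name are hypothetical, not taken from any framework.

```javascript
function isRequestAuthorized(request, session) {
  // The cookie alone only shows that the browser holds a session;
  // it is sent automatically, even on requests forged by another site.
  if (!session || request.cookies.sessionId !== session.id) return false;
  // The token must be supplied explicitly by the requesting page.
  return typeof request.params.csrfToken === "string" &&
         request.params.csrfToken === session.csrfToken;
}

var session = { id: "abc123", csrfToken: "t0k3n" };
var forged = { cookies: { sessionId: "abc123" }, params: {} };
var genuine = { cookies: { sessionId: "abc123" }, params: { csrfToken: "t0k3n" } };
console.log(isRequestAuthorized(forged, session));  // false
console.log(isRequestAuthorized(genuine, session)); // true
```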

FOR FURTHER INFORMATION
http://www.json.org/
http://www.javapassion.com/ajax/JSON.pdf
http://jsonformatter.curiousconcept.com/
http://www.roseindia.net/tutorials/json/
http://www.secretgeek.net/json_3mins.asp
