<<O>>  Difference Topic XAYAThinkingWithData (r1.21 - 21 Jan 2005 - Main.utsim)

META TOPICPARENT WebHome

xaya_wide_cmyk_lg.jpg

XAYA -- Thinking with Data

Line: 21 to 21

To learn more about working in this TWiki, please go to HarmenyTWikiGuest

Added:
>
>
The information on this page is more technical in nature, a "developer's journal" of ideas going into XAYA. A simpler, user-friendly introduction to XAYA that doesn't assume Python or programming knowledge is at: XAYAXaYa

This site has been set up to support a Python Software Foundation grant proposal, and contains a basic description of XAYA's goals, philosophy, concepts, proposed methods, and some early code examples (code's going to be fairly thoroughly reworked this weekend -- mb Sept 30, 2004).

Original PSF Grant Application

 <<O>>  Difference Topic XAYAThinkingWithData (r1.20 - 10 Dec 2004 - Main.utsim)


Line: 19 to 19

Return to WebHome

Changed:
<
<
To learn more about working in this TWiki go to HarmenyTWikiGuest
>
>
To learn more about working in this TWiki, please go to HarmenyTWikiGuest

This site has been set up to support a Python Software Foundation grant proposal, and contains a basic description of XAYA's goals, philosophy, concepts, proposed methods, and some early code examples (code's going to be fairly thoroughly reworked this weekend -- mb Sept 30, 2004).

 <<O>>  Difference Topic XAYAThinkingWithData (r1.19 - 02 Dec 2004 - TWikiGuest)


Line: 31 to 31

Current Code, Data, and Testing Examples (see bottom of page for more code and usage examples)

Added:
>
>
  • CodeInHTML
    • The latest code base (November 30, 2004) is here for viewing as an HTML file.

 <<O>>  Difference Topic XAYAThinkingWithData (r1.18 - 02 Nov 2004 - Main.utsim)


Line: 31 to 31

Current Code, Data, and Testing Examples (see bottom of page for more code and usage examples)

Changed:
<
<
>
>

If you have questions and/or if you're interested in contributing to the XAYA project, drop me a line at mishtu@harmeny.com with "XAYA" in the subject line

Line: 41 to 43

Changed:
<
<
-- MishtuBanerjee - 13 Oct 2004
>
>
-- MishtuBanerjee - 01 Nov 2004

XAYAvision

Line: 529 to 531

This is some initial code, developed to illustrate ideas for XAYA in its initial iteration. It should not be taken as the final code for the first iteration, but rather as a working out of code conventions and an initial command-line interface via prototyping. Expect the code to be updated every few days. The top link is the most recent version of the code (so one gets a running view of the developing style). Current conventions are to use a test-first development style (define unit tests, then write code) via the unittest module; a pseudo-literate coding style that documents any algorithms used and gives references to code sources; and, finally, incorporation of usage examples via the doctest module. The code is currently more verbose than it needs to be, and will get stripped down. The aim is to try to balance conciseness and clarity (emphasis on "try").
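The conventions described above (unit tests first, usage examples via doctest) might look roughly like the following. This is only a minimal sketch, not the project's code: the function name find_roots_leaves, the dictionary-of-lists graph representation, and the test cases are all assumptions made for illustration.

```python
import doctest
import unittest


def find_roots_leaves(graph):
    """Return (roots, leaves) of a directed graph given as {node: [successors]}.

    Roots have no incoming edges; leaves have no outgoing edges.
    (Hypothetical helper, written only to illustrate the conventions.)

    >>> find_roots_leaves({"a": ["b", "c"], "b": ["c"], "c": []})
    (['a'], ['c'])
    """
    nodes = set(graph)
    targets = set()
    for successors in graph.values():
        targets.update(successors)
    nodes |= targets  # include nodes that only appear as successors
    roots = sorted(nodes - targets)
    leaves = sorted(n for n in nodes if not graph.get(n))
    return roots, leaves


class FindRootsLeavesTest(unittest.TestCase):
    """Unit tests, written before the function body in a test-first style."""

    def test_chain(self):
        self.assertEqual(find_roots_leaves({"x": ["y"], "y": ["z"]}), (["x"], ["z"]))

    def test_cycle_has_no_roots_or_leaves(self):
        self.assertEqual(find_roots_leaves({"p": ["q"], "q": ["p"]}), ([], []))


if __name__ == "__main__":
    doctest.testmod()  # checks the usage example in the docstring
    unittest.main(argv=["xaya_sketch"], exit=False)  # runs the unit tests
```

The docstring example doubles as documentation and as a test, which is the attraction of combining the two modules.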

Line: 575 to 589

Added:
>
>


META FILEATTACHMENT devcore_20040826.zip attr="" comment="Prototypes as of August 26, 2004" date="1096586045" path="C:\XAYA\devcore_20040826.zip" size="8585" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040914.zip attr="" comment="Prototypes as of September 14, 2004" date="1096586069" path="C:\XAYA\devcore_20040914.zip" size="21282" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040921.zip attr="" comment="Prototypes as of September 21, 2004" date="1096586105" path="C:\XAYA\devcore_20040921.zip" size="146169" user="utsim" version="1.1"
Line: 588 to 603

META FILEATTACHMENT ThinkingWithGraphs_20041011.pdf attr="" comment="A visual introduction to thinking with graphs" date="1097705990" path="C:\XAYA\XAYAextension\ThinkingWithGraphs_20041011.pdf" size="740604" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20041029.zip attr="" comment="Prototypes as of October 29, 2004" date="1099095283" path="C:\XAYA\devcore_20041029.zip" size="306954" user="utsim" version="1.1"
META FILEATTACHMENT XAYAcoreUsageExample_20041029.doc attr="" comment="Command Line Usage Example %_Q_%findRootsLeaves%_Q_%" date="1099095339" path="C:\XAYA\XAYAextension\UsageExamples\XAYAcoreUsageExample_20041029.doc" size="82944" user="utsim" version="1.1"
Added:
>
>
META FILEATTACHMENT devcore_20041101.zip attr="" comment="Prototypes as of November 1, 2004" date="1099357377" path="C:\XAYA\devcore_20041101.zip" size="522664" user="utsim" version="1.1"
META FILEATTACHMENT XAYAcoreUsageExample_20041101.doc attr="" comment="Example of using Relation Operations to Program" date="1099357517" path="C:\XAYA\XAYAextension\UsageExamples\XAYAcoreUsageExample_20041101.doc" size="421376" user="utsim" version="1.1"
 <<O>>  Difference Topic XAYAThinkingWithData (r1.17 - 30 Oct 2004 - Main.utsim)


Line: 19 to 19

Return to WebHome

Added:
>
>
To learn more about working in this TWiki go to HarmenyTWikiGuest

This site has been set up to support a Python Software Foundation grant proposal, and contains a basic description of XAYA's goals, philosophy, concepts, proposed methods, and some early code examples (code's going to be fairly thoroughly reworked this weekend -- mb Sept 30, 2004).

Original PSF Grant Application

Line: 31 to 33

Current Code, Data, and Testing Examples (see bottom of page for more code and usage examples)
Added:
>
>
If you have questions and/or if you're interested in contributing to the XAYA project, drop me a line at mishtu@harmeny.com with "XAYA" in the subject line

 <<O>>  Difference Topic XAYAThinkingWithData (r1.16 - 30 Oct 2004 - Main.utsim)


Line: 28 to 28

  • Based on feedback received from some friends after I submitted the original proposal, I made a slight revision to (i) add more discussion of Analysis Patterns, (ii) clarify the payment plan as a cost per module, and (iii) correct typos and refine the text for reading flow.
  • psfgrantXAYA_ThinkingWithData_20041013.pdf: Small Revisions to PSF Proposal for Readability
Added:
>
>
Current Code, Data, and Testing Examples (see bottom of page for more code and usage examples)

Line: 129 to 131

A more detailed discussion will be placed later at XAYAConceptsExplained

Changed:
<
<
>
>



Line: 522 to 525

This is some initial code, developed to illustrate ideas for XAYA in its initial iteration. It should not be taken as the final code for the first iteration, but rather as a working out of code conventions and an initial command-line interface via prototyping. Expect the code to be updated every few days. The top link is the most recent version of the code (so one gets a running view of the developing style). Current conventions are to use a test-first development style (define unit tests, then write code) via the unittest module; a pseudo-literate coding style that documents any algorithms used and gives references to code sources; and, finally, incorporation of usage examples via the doctest module. The code is currently more verbose than it needs to be, and will get stripped down. The aim is to try to balance conciseness and clarity (emphasis on "try").

Line: 562 to 568

Added:
>
>


META FILEATTACHMENT devcore_20040826.zip attr="" comment="Prototypes as of August 26, 2004" date="1096586045" path="C:\XAYA\devcore_20040826.zip" size="8585" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040914.zip attr="" comment="Prototypes as of September 14, 2004" date="1096586069" path="C:\XAYA\devcore_20040914.zip" size="21282" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040921.zip attr="" comment="Prototypes as of September 21, 2004" date="1096586105" path="C:\XAYA\devcore_20040921.zip" size="146169" user="utsim" version="1.1"
Line: 573 to 582

META FILEATTACHMENT devcore_20041011.zip attr="" comment="Prototypes as of October 11, 2004" date="1097705124" path="C:\XAYA\devcore_20041011.zip" size="225303" user="utsim" version="1.1"
META FILEATTACHMENT XAYcoreUsageExample_20041006.doc attr="" comment="Command Line Usage Example %_Q_%findAllPaths%_Q_%" date="1097705357" path="C:\XAYA\XAYAextension\UsageExamples\XAYcoreUsageExample_20041006.doc" size="75264" user="utsim" version="1.1"
META FILEATTACHMENT ThinkingWithGraphs_20041011.pdf attr="" comment="A visual introduction to thinking with graphs" date="1097705990" path="C:\XAYA\XAYAextension\ThinkingWithGraphs_20041011.pdf" size="740604" user="utsim" version="1.1"
Added:
>
>
META FILEATTACHMENT devcore_20041029.zip attr="" comment="Prototypes as of October 29, 2004" date="1099095283" path="C:\XAYA\devcore_20041029.zip" size="306954" user="utsim" version="1.1"
META FILEATTACHMENT XAYAcoreUsageExample_20041029.doc attr="" comment="Command Line Usage Example %_Q_%findRootsLeaves%_Q_%" date="1099095339" path="C:\XAYA\XAYAextension\UsageExamples\XAYAcoreUsageExample_20041029.doc" size="82944" user="utsim" version="1.1"
 <<O>>  Difference Topic XAYAThinkingWithData (r1.15 - 24 Oct 2004 - Main.utsim)


 <<O>>  Difference Topic XAYAThinkingWithData (r1.14 - 24 Oct 2004 - Main.utsim)


Line: 39 to 39

XAYAvision

Changed:
<
<
In A Nutshell. XAYA is to be a Pythonic logic-based mini- language for thinking with data, and this site is to document the creation of XAYA as an extended tutorial of “Thinking with Data in Python”. Thinking with Data is a trial-and-error process of exploration of relationships amongst data (in this case “data” can mean everything from numerical observations, to semantic ontologies). One asks questions (queries), one gets answers (query result sets). Is the answer what you expected? No? Revise, and repeat. A great deal of planning, “business intelligence” and analysis work in various industries requires just such “thinking with data”, where questions become iteratively refined based on previous results. One of XAYA’s key goals is to provide a framework that can introduce Python to that business environment, and in particular to planners and analysts who are not primarily programmers, but who are required to reason with complex data and make valid inferences.
>
>
In A Nutshell:

XAYA is to be a Pythonic logic-based mini-language for thinking with data, and this site is to document the creation of XAYA as an extended tutorial of “Thinking with Data in Python”. Thinking with Data is a trial-and-error process of exploration of relationships amongst data (in this case “data” can mean everything from numerical observations, to semantic ontologies). One asks questions (queries), one gets answers (query result sets). Is the answer what you expected? No? Revise, and repeat. A great deal of planning, “business intelligence” and analysis work in various industries requires just such “thinking with data”, where questions become iteratively refined based on previous results. One of XAYA’s key goals is to provide a framework that can introduce Python to that business environment, and in particular to planners and analysts who are not primarily programmers, but who are required to reason with complex data and make valid inferences.


XAYA is centred around the idea of representing information as graphs, and being able to "navigate" through those graphs. This idea originated in the insight that relational systems (such as databases) can be modelled as graphs, and a query in such a system represents a "path through a graph".
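To make "a query is a path through a graph" concrete, here is a minimal sketch under stated assumptions: the table names and links in SCHEMA_GRAPH are invented for illustration, and find_path is a plain breadth-first search, not XAYA's own query engine.

```python
from collections import deque

# Hypothetical toy schema: each table is a node, each foreign-key link an edge.
SCHEMA_GRAPH = {
    "customer": ["order"],
    "order": ["order_line"],
    "order_line": ["product"],
    "product": ["supplier"],
    "supplier": [],
}


def find_path(graph, start, goal):
    """Return one shortest path from start to goal as a list of nodes, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # goal is not reachable from start


# "Which products has a customer bought?" becomes a walk from customer to product.
print(find_path(SCHEMA_GRAPH, "customer", "product"))
# → ['customer', 'order', 'order_line', 'product']
```

Each hop in the returned path corresponds to one relationship the query has to follow.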

Line: 52 to 54

Changed:
<
<
A Language for thinking with data. Data is everywhere, we try and make sense of it. But we keep getting caught up in the details. Here's a brief overview of the details we end up worrying about when thinking about data (as a prelude to thinking with data)
>
>
A Language for thinking with data:

Data is everywhere, and we try to make sense of it. But we keep getting caught up in the details. Here's a brief overview of the details we end up worrying about when thinking about data (as a prelude to thinking with data):


First we have to worry about "what format the data is in, and how to access it". Is the data in a database? Is it in XML format? Is there a DTD, a Schema?

Line: 73 to 77

  • A query engine for "finding paths through graphs". Given a mini-language for dealing with graphs, and a template for developing models in our system, we need a way to ask increasingly sophisticated questions. At the simplest level, this means supporting "finding paths through graphs". At higher levels of sophistication, the act of finding paths through graphs gradually becomes the basis of querying and, at an even more sophisticated level, a basis for "declarative programming", which focusses on defining "what" the solution is while allowing an automated system (the query or inference engine) to figure out how to arrive at it. For a capsule summary of declarative programming, and a comparison to some other styles, go to Wikipedia
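At the simplest level mentioned above, "finding paths through graphs" could be sketched as a depth-first enumeration of every cycle-free path between two nodes. The function name find_all_paths and the sample GRAPH are assumptions for illustration; this is not the findAllPaths code from the attached prototypes.

```python
def find_all_paths(graph, start, goal, path=None):
    """Yield every simple (cycle-free) path from start to goal."""
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for neighbour in graph.get(start, []):
        if neighbour not in path:  # skip nodes already on this path
            yield from find_all_paths(graph, neighbour, goal, path)


# Hypothetical example graph: {node: [successors]}.
GRAPH = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["d"],
    "d": [],
}

for p in find_all_paths(GRAPH, "a", "d"):
    print(" -> ".join(p))
```

The caller only states the two endpoints (the "what"); the traversal order is left to the function, which hints at the declarative style described in the bullet.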
Changed:
<
<
Freedom, Simplicity, Connection. These three words summarize the spirit behind the vision for XAYA. One day, I was talking to a friend, and trying to explain the technical ideas behind XAYA. Graphs. Analysis Patterns. Declarative Languages. Yada Yada Yada. I watched his eyes begin to glaze over. Desperately, I said, "well it's really about freedom, simplicity, connection". The pupils shifted back into focus a bit. "What do you mean". Well freedom as in freedom to explore. In the limited field of data and information -- how do we give people that freedom to explore. Simplicity -- as in make it as simple as possible and no simpler. Connection as in, connecting pieces of information together -- being able to link the pieces in a chain of thought from instances to ontologies. And then it struck me. Freedom-simplicity-connection, are not technical points, but my hopes for the kind of community spirit that might grow around such a project. Who has information they need to organize? Well, businesses of course, with their enterprise information systems. Scientists with large collaborative research projects. Those are the obvious ones. But what about poets building models of simile. Visual artists organizing scraps of imagery for future transformations. Can we build a system that attracts not only software engineers, data analysts and "domain specialists", but reaches out to poets, painters, singers, dancers, recipe collectors, stamp savers, the persistently forgetful .....
>
>
Freedom, Simplicity, Connection:

These three words summarize the spirit behind the vision for XAYA. One day, I was talking to a friend, and trying to explain the technical ideas behind XAYA. Graphs. Analysis Patterns. Declarative Languages. Yada Yada Yada. I watched his eyes begin to glaze over. Desperately, I said, "well it's really about freedom, simplicity, connection". The pupils shifted back into focus a bit. "What do you mean". Well freedom as in freedom to explore. In the limited field of data and information -- how do we give people that freedom to explore. Simplicity -- as in make it as simple as possible and no simpler. Connection as in, connecting pieces of information together -- being able to link the pieces in a chain of thought from instances to ontologies. And then it struck me. Freedom-simplicity-connection, are not technical points, but my hopes for the kind of community spirit that might grow around such a project. Who has information they need to organize? Well, businesses of course, with their enterprise information systems. Scientists with large collaborative research projects. Those are the obvious ones. But what about poets building models of simile. Visual artists organizing scraps of imagery for future transformations. Can we build a system that attracts not only software engineers, data analysts and "domain specialists", but reaches out to poets, painters, singers, dancers, recipe collectors, stamp savers, the persistently forgetful .....

Meeting this vision requires balancing very general concepts about knowledge, information and logic with the specific details of query systems, data mini-language design, specific data formats ...... The goal is to balance high-level and low-level perspectives, and avoid collisions between those perspectives. In the end, we must hide the details from those who don't need them, and make them exposable to those who do. When the software engineer's "How" begins to struggle with the logician's "What" and the humanist's "Why Not!", perhaps it is time for poetry to intervene. Indeed, Richard Gabriel in a recent interview on the Sun site has called for an MFA in The Poetry of Programming.


Changed:
<
<
To meet this vision, requires balancing very general concepts about knowledge, information and logic with specific details of query systems, data mini-language design, specific data formats ...... The goal is to balance high level and low level perspectives, and avoid collisions between those perspectives. In the end, we must hide the details from those who don't need them; and make them exposable to those who do. When the software engineering's "How" begins to struggle with the logician's "What" and the humanist's "Why Not!", perhaps it is time for poetry to intervene:
>
>
In that poetic spirit, here is a small metaphoric guidance algorithm:

Algorithm for Collision Avoidance

 <<O>>  Difference Topic XAYAThinkingWithData (r1.13 - 24 Oct 2004 - Main.utsim)


Line: 73 to 73

  • A query engine for "finding paths through graphs". Given a mini-language for dealing with graphs, and a template for developing models in our system, we need a way to ask increasingly sophisticated questions. At the simplest level, this means supporting "finding paths through graphs". At higher levels of sophistication, the act of finding paths through graphs gradually becomes the basis of querying and, at an even more sophisticated level, a basis for "declarative programming", which focusses on defining "what" the solution is while allowing an automated system (the query or inference engine) to figure out how to arrive at it. For a capsule summary of declarative programming, and a comparison to some other styles, go to Wikipedia
Changed:
<
<
Freedom, Simplicity, Connection. These three words summarize the spirit behind the vision for XAYA. One day, I was talking to a friend, and trying to explain the technical ideas behind XAYA. Graphs. Analysis Patterns. Declarative Languages. Yada Yada Yada. I watched his eyes begin to glaze over. Desperately, I said, "well it's really about freedom, simplicity, connection". The pupils shifted back into focus a bit. "What do you mean". Well freedom as in freedom to explore. In the limited field of data and information -- how do we give people that freedom to explore. Simplicity -- as in make it as simple as possible and no simpler. Connection as in, connecting pieces of information together -- being able to link the pieces in a chain of thought from instances to ontologies. And then it struck me. Freedom-simplicity-connection, are not technical points, but my hopes for the kind of community spirit that might grow around such a project.
>
>
Freedom, Simplicity, Connection. These three words summarize the spirit behind the vision for XAYA. One day, I was talking to a friend, and trying to explain the technical ideas behind XAYA. Graphs. Analysis Patterns. Declarative Languages. Yada Yada Yada. I watched his eyes begin to glaze over. Desperately, I said, "well it's really about freedom, simplicity, connection". The pupils shifted back into focus a bit. "What do you mean". Well freedom as in freedom to explore. In the limited field of data and information -- how do we give people that freedom to explore. Simplicity -- as in make it as simple as possible and no simpler. Connection as in, connecting pieces of information together -- being able to link the pieces in a chain of thought from instances to ontologies. And then it struck me. Freedom-simplicity-connection, are not technical points, but my hopes for the kind of community spirit that might grow around such a project. Who has information they need to organize? Well, businesses of course, with their enterprise information systems. Scientists with large collaborative research projects. Those are the obvious ones. But what about poets building models of simile. Visual artists organizing scraps of imagery for future transformations. Can we build a system that attracts not only software engineers, data analysts and "domain specialists", but reaches out to poets, painters, singers, dancers, recipe collectors, stamp savers, the persistently forgetful .....

Meeting this vision requires balancing very general concepts about knowledge, information and logic with the specific details of query systems, data mini-language design, specific data formats ...... The goal is to balance high-level and low-level perspectives, and avoid collisions between those perspectives. In the end, we must hide the details from those who don't need them, and make them exposable to those who do. When the software engineer's "How" begins to struggle with the logician's "What" and the humanist's "Why Not!", perhaps it is time for poetry to intervene:

Algorithm for Collision Avoidance

the specifics and the generals

are like a flock of birds and their shadow

within the flock each bird is particular, isolate

  • focussed on avoiding collision with
  • its nearest neighbours

from the ground individuals meld

  • their raucous squawks go unheard
  • but we see, they are indeed
  • travelling north, at steady velocity
  • as a cohesive entity: the flock

Generals are the magnetite in a bird's skull, pointing north

Specifics are the wings, getting you there.

Along the way, let's have some fun!


 <<O>>  Difference Topic XAYAThinkingWithData (r1.12 - 23 Oct 2004 - Main.utsim)


 <<O>>  Difference Topic XAYAThinkingWithData (r1.11 - 21 Oct 2004 - Main.utsim)


Line: 41 to 41

In A Nutshell. XAYA is to be a Pythonic logic-based mini-language for thinking with data, and this site is to document the creation of XAYA as an extended tutorial of “Thinking with Data in Python”. Thinking with Data is a trial-and-error process of exploration of relationships amongst data (in this case “data” can mean everything from numerical observations, to semantic ontologies). One asks questions (queries), one gets answers (query result sets). Is the answer what you expected? No? Revise, and repeat. A great deal of planning, “business intelligence” and analysis work in various industries requires just such “thinking with data”, where questions become iteratively refined based on previous results. One of XAYA’s key goals is to provide a framework that can introduce Python to that business environment, and in particular to planners and analysts who are not primarily programmers, but who are required to reason with complex data and make valid inferences.

Added:
>
>
XAYA is centred around the idea of representing information as graphs, and being able to "navigate" through those graphs. This idea originated in the insight that relational systems (such as databases) can be modelled as graphs, and a query in such a system represents a "path through a graph".

Deleted:
<
<
Freedom, Simplicity, Connection. These three words summarize the spirit behind the vision for XAYA. XAYA is centred around the idea of representing information as graphs, and being able to "navigate" through those graphs. This idea originated in the insight that relational systems (such as databases) can be modelled as graphs, and a query in such a system represents a "path through a graph".

Changed:
<
<
XAYA takes that basic idea "a query is a path through a graph" and generalizes it so that the query engine no longer is concerned about a particular storage format (such as relational databases, XML, RDF etc). One should focus on the information, and it's inherent relationships, rather than underlying format. In this context, we want to try and apply a set of design principles that are focussed on Logic, Abstraction, Graphs, Language.
>
>
XAYA's goal is to take that basic idea "a query is a path through a graph" and generalize it so that the query engine is no longer concerned with a particular storage format (such as relational databases, XML, RDF, etc.). One should focus on the information and its inherent relationships, rather than the underlying format. In this context, we want to try to apply a set of design principles that are focussed on Logic, Abstraction, Graphs, Language.

An earlier version of this idea was implemented in LifeLine, an application that focussed on representing databases as "tree-like" graphs and allowed the user to query a database in fairly sophisticated ways without requiring knowledge of the underlying database structure, SQL, etc. At its centre was a query engine that translated between what the user had selected as a "scenario", a graph representation of the data, and the SQL needed to obtain the "scenario" from a relational database.
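The kind of translation described here, from a selected path in a graph to SQL, could look something like the sketch below. It is not LifeLine's engine: the JOINS table, the table and column names, and the path_to_sql function are all hypothetical, chosen only to show one JOIN being emitted per edge of the path.

```python
# Hypothetical schema graph: each edge carries the join condition between two tables.
JOINS = {
    ("customer", "orders"): "customer.id = orders.customer_id",
    ("orders", "order_line"): "orders.id = order_line.order_id",
    ("order_line", "product"): "order_line.product_id = product.id",
}


def path_to_sql(path, columns):
    """Translate a path of table names into a SELECT with one JOIN per edge."""
    if not path:
        raise ValueError("path must contain at least one table")
    sql = ["SELECT " + ", ".join(columns), "FROM " + path[0]]
    for left, right in zip(path, path[1:]):
        condition = JOINS[(left, right)]  # KeyError if the tables are not linked
        sql.append("JOIN {} ON {}".format(right, condition))
    return "\n".join(sql)


scenario = ["customer", "orders", "order_line", "product"]
print(path_to_sql(scenario, ["customer.name", "product.name"]))
```

The user would only pick the endpoints and columns of interest; the join conditions stay hidden in the graph's edges.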

Added:
>
>


A Language for thinking with data. Data is everywhere, and we try to make sense of it. But we keep getting caught up in the details. Here's a brief overview of the details we end up worrying about when thinking about data (as a prelude to thinking with data):

First we have to worry about "what format the data is in, and how to access it". Is the data in a database? Is it in XML format? Is there a DTD, a Schema?

Line: 70 to 73

  • A query engine for "finding paths through graphs". Given a mini-language for dealing with graphs, and a template for developing models in our system, we need a way to ask increasingly sophisticated questions. At the simplest level, this means supporting "finding paths through graphs". At higher levels of sophistication, the act of finding paths through graphs gradually becomes the basis of querying and, at an even more sophisticated level, a basis for "declarative programming", which focusses on defining "what" the solution is while allowing an automated system (the query or inference engine) to figure out how to arrive at it. For a capsule summary of declarative programming, and a comparison to some other styles, go to Wikipedia
Added:
>
>
Freedom, Simplicity, Connection. These three words summarize the spirit behind the vision for XAYA. One day, I was talking to a friend, and trying to explain the technical ideas behind XAYA. Graphs. Analysis Patterns. Declarative Languages. Yada Yada Yada. I watched his eyes begin to glaze over. Desperately, I said, "well it's really about freedom, simplicity, connection". The pupils shifted back into focus a bit. "What do you mean". Well freedom as in freedom to explore. In the limited field of data and information -- how do we give people that freedom to explore. Simplicity -- as in make it as simple as possible and no simpler. Connection as in, connecting pieces of information together -- being able to link the pieces in a chain of thought from instances to ontologies. And then it struck me. Freedom-simplicity-connection, are not technical points, but my hopes for the kind of community spirit that might grow around such a project.

 <<O>>  Difference Topic XAYAThinkingWithData (r1.10 - 20 Oct 2004 - Main.utsim)

META TOPICPARENT WebHome

xaya_wide_cmyk_lg.jpg
Changed:
<
<

XAYAThinkingWithData

>
>

XAYA -- Thinking with Data


Added:
>
>
XAYAvision: "Freedom Simplicity Connection"

Added:
>
>
XAYAcore: "A Language for Thinking with Data"

Deleted:
<
<
XAYAcore: "A Language for Thinking with Data"

Deleted:
<
<
XAYAvision: "Freedom Simplicity Connection"

Line: 28 to 28

  • Based on feedback received from some friends after I submitted the original proposal, I made a slight revision to (i) add more discussion of Analysis Patterns, (ii) clarify the payment plan as a cost per module, and (iii) correct typos and refine the text for reading flow.
  • psfgrantXAYA_ThinkingWithData_20041013.pdf: Small Revisions to PSF Proposal for Readability
Deleted:
<
<

Some Preliminary Concepts Explained Visually


Deleted:
<
<
Based on feedback on this site from some friends who are not primarily programmers, it was suggested I create some visual introductions to the key concepts which assume no background in programming jargon to begin with:
  • Representing a Model as a Graph ("thinking with graphs)
  • Analysis Patterns (see "Analysis Patterns, Reusable Object Model's" by Martin Fowler, and more recent material on his website at http://www.martinfowler.com/articles.html )
  • What does querying, declarative programming style, mini-language mean to programmers?

Deleted:
<
<
The idea of "thinking with graphs" is illustrated in:

Changed:
<
<
Over the next week, I'll develop two more visual intro's to the other two topics.
>
>

-- MishtuBanerjee - 13 Oct 2004

Changed:
<
<

XAYAvision

>
>

XAYAvision

In A Nutshell. XAYA is to be a Pythonic logic-based mini-language for thinking with data, and this site is to document the creation of XAYA as an extended tutorial of “Thinking with Data in Python”. Thinking with Data is a trial-and-error process of exploration of relationships amongst data (in this case “data” can mean everything from numerical observations, to semantic ontologies). One asks questions (queries), one gets answers (query result sets). Is the answer what you expected? No? Revise, and repeat. A great deal of planning, “business intelligence” and analysis work in various industries requires just such “thinking with data”, where questions become iteratively refined based on previous results. One of XAYA’s key goals is to provide a framework that can introduce Python to that business environment, and in particular to planners and analysts who are not primarily programmers, but who are required to reason with complex data and make valid inferences.


Changed:
<
<
Freedom, Simplicity, Connection. These three words summarize the spirit behind the vision for Xaya. Xaya is centred around the idea of representing information as graphs, and being able to "navigate" through those graphs. This idea originated in the insight that relational systems (such as databases) can be modelled as graphs, and a query in such a system represents a "path through a graph". An earlier version of this idea was inplemented in LifeLine an application that focussed on representing databases as "tree-like" graphs, and allowing the user to query a database in fairly sophisticated ways without requiring knowledge of the underlying database structure, SQL, etc. At it's centre was a query engine that translates between what the user has selected as a "scenario", a graph representation of the data, and the SQL needed to obtain the "scenario" from a relational database.
>
>

Freedom, Simplicity, Connection. These three words summarize the spirit behind the vision for XAYA. XAYA is centred around the idea of representing information as graphs, and being able to "navigate" through those graphs. This idea originated in the insight that relational systems (such as databases) can be modelled as graphs, and a query in such a system represents a "path through a graph".


XAYA takes that basic idea "a query is a path through a graph" and generalizes it so that the query engine is no longer concerned with a particular storage format (such as relational databases, XML, RDF, etc.). One should focus on the information and its inherent relationships, rather than the underlying format. In this context, we want to try to apply a set of design principles that are focussed on Logic, Abstraction, Graphs, Language.
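One way to picture a query engine that does not care about storage format is to load different formats into the same in-memory graph and run the same query over both. The sketch below assumes two invented input formats (CSV rows of "source,target" and XML edge elements with source and target attributes); none of these names or layouts come from XAYA itself.

```python
import csv
import io
import xml.etree.ElementTree as ET


def add_edge(graph, source, target):
    """Record a directed edge, making sure both endpoints exist as nodes."""
    graph.setdefault(source, []).append(target)
    graph.setdefault(target, [])


def graph_from_csv(text):
    """Build a graph from CSV rows of the (assumed) form 'source,target'."""
    graph = {}
    for source, target in csv.reader(io.StringIO(text)):
        add_edge(graph, source, target)
    return graph


def graph_from_xml(text):
    """Build a graph from (assumed) <edge source="..." target="..."/> elements."""
    graph = {}
    for edge in ET.fromstring(text).iter("edge"):
        add_edge(graph, edge.get("source"), edge.get("target"))
    return graph


def successors(graph, node):
    """A format-agnostic query: which nodes does `node` point to?"""
    return graph.get(node, [])


CSV_DATA = "author,book\nbook,publisher\n"
XML_DATA = (
    '<graph><edge source="author" target="book"/>'
    '<edge source="book" target="publisher"/></graph>'
)

# Both loaders produce the same graph, so the query never sees the format.
assert graph_from_csv(CSV_DATA) == graph_from_xml(XML_DATA)
print(successors(graph_from_csv(CSV_DATA), "author"))
# → ['book']
```

Supporting another format would then mean writing one more loader, with the query code left unchanged.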

Added:
>
>
An earlier version of this idea was implemented in LifeLine, an application that focussed on representing databases as "tree-like" graphs, and allowed the user to query a database in fairly sophisticated ways without requiring knowledge of the underlying database structure, SQL, etc. At its centre was a query engine that translates between what the user has selected as a "scenario", a graph representation of the data, and the SQL needed to obtain the "scenario" from a relational database.

A Language for thinking with data. Data is everywhere, and we try to make sense of it. But we keep getting caught up in the details. Here's a brief overview of the details we end up worrying about when thinking about data (as a prelude to thinking with data).

Line: 61 to 59

Points one to three all concern the challenges of navigating through information in search of meaning. Navigating through information is like the early days of the automobile, where one had to be half-engineer simply to drive. XAYA's goal is to support the idea of "just driving", focussing on your understanding of information in a particular field (be it an area of business, science, history, etc.). As such, XAYA focusses on allowing you to model your knowledge and link it to existing data, and supports exploratory reasoning about relationships amongst information: in short, "Thinking with Data". More formally, "Logic".

Deleted:
<
<
The aim is for XAYA to be a Pythonic, logic-based mini-language for thinking with data, and to document the creation of that language as an extended tutorial on “Thinking with Data in Python”. Thinking with Data is a trial-and-error process of exploring relationships amongst data (here “data” can mean anything from numerical observations to semantic ontologies). One asks questions (queries), one gets answers (query result sets). Is the answer what you expected? No? Revise, and repeat. A great deal of planning, “business intelligence” and analysis work in various industries requires just such “thinking with data”, where questions are iteratively refined based on previous results. One of XAYA’s key goals is to provide a framework that can introduce Python to that business environment, and in particular to planners and analysts who are not primarily programmers, but who are required to reason with complex data and make valid inferences.

Deleted:
<
<
XAYA is built up from three elements:
  • A "miniature language" for representing and reasoning with graphs. There are many excellent graph packages in Python, and the goal is not to re-invent them, but to focus on having a set of primitives with which we can deal with a wide range of data as graphs. In particular, we want to build up the mini-language so it can support data in many formats in a uniform way. Achieving this sets us up for the second step: developing models of data via Analysis Patterns.
  • A template for Analysis Patterns. Analysis Patterns are models of a specific area of knowledge. A template for Analysis Patterns is a fill-in form that allows you to define a model for your particular area of knowledge. In particular, you want to identify the "objects" of your knowledge, their "relationships" and their "rules". The goal is to develop a template easy enough that non-programmers can use it to model their area of expertise. Given a uniform mini-language for representing data, and a template for modelling data and its relationships, we are ready to begin to ask questions of data.
  • A query engine for "finding paths through graphs". Given a mini-language for dealing with graphs, and a template for developing models in our system, we need a way to ask increasingly sophisticated questions. At the simplest level, this means supporting "finding paths through graphs". At higher levels of sophistication, the act of finding paths through graphs gradually becomes the basis of querying, and at an even more sophisticated level, a basis for "declarative programming", which focusses on defining "what" the solution is, while allowing an automated system (the query or inference engine) to figure out how to obtain it. For a capsule summary of declarative programming, and a comparison to some other styles, go to Wikipedia.

Added:
>
>
XAYA is built up from three elements:

  • A "miniature language" for representing and reasoning with graphs. There are many excellent graph packages in Python, and the goal is not to re-invent them, but to focus on having a set of primitives with which we can deal with a wide range of data as graphs. In particular, we want to build up the mini-language so it can support data in many formats in a uniform way. Achieving this sets us up for the second step: developing models of data via Analysis Patterns.

Added:
>
>
  • A template for Analysis Patterns. Analysis Patterns are models of a specific area of knowledge. A template for Analysis Patterns is a fill-in form that allows you to define a model for your particular area of knowledge. In particular, you want to identify the "objects" of your knowledge, their "relationships" and their "rules". The goal is to develop a template easy enough that non-programmers can use it to model their area of expertise. Given a uniform mini-language for representing data, and a template for modelling data and its relationships, we are ready to begin to ask questions of data.

Added:
>
>
  • A query engine for "finding paths through graphs". Given a mini-language for dealing with graphs, and a template for developing models in our system, we need a way to ask increasingly sophisticated questions. At the simplest level, this means supporting "finding paths through graphs". At higher levels of sophistication, the act of finding paths through graphs gradually becomes the basis of querying, and at an even more sophisticated level, a basis for "declarative programming", which focusses on defining "what" the solution is, while allowing an automated system (the query or inference engine) to figure out how to obtain it. For a capsule summary of declarative programming, and a comparison to some other styles, go to Wikipedia.
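To make "a query is a path through a graph" a little more concrete, here is a minimal sketch in plain Python. It assumes a graph is held as a dictionary mapping each node to a list of its children; the node names and the function name find_all_paths are made up for illustration and are not taken from the XAYA code base.

```python
# Hypothetical example graph: each key is a node, each value lists its children.
graph = {
    "inventory": ["stand", "polygon"],
    "stand": ["species"],
    "polygon": ["species"],
    "species": [],
}


def find_all_paths(graph, start, end, path=()):
    """Yield every path from start to end as a list of node names."""
    path = path + (start,)
    if start == end:
        yield list(path)
        return
    for child in graph.get(start, []):
        # Skip nodes already on this path, so a cycle cannot loop forever.
        if child not in path:
            yield from find_all_paths(graph, child, end, path)


# "Which routes lead from inventory to species?" asked as a path query.
paths = list(find_all_paths(graph, "inventory", "species"))
```

Here the question is stated as a start and an end node, and the engine is left to work out the routes, which is the declarative flavour described above in miniature.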

Deleted:
<
<

XAYAconcepts


Deleted:
<
<
Write up on October 1, 2004

Deleted:
<
<
Ascendency -- Original biological motivation -- from study of ecosystems as "networks of interaction". Bob Ulanowicz's information theoretic framework for quantifying ecosystem dynamics. Realization same ideas could be transferred over from biology networks to "computer networks".

Changed:
<
<
GAL -- Boolean, Nested Boolean, Predicate Logic as graphs.
>
>

XAYAconcepts


Changed:
<
<
Query Languages and Logic Programming
>
>
Based on feedback on this site from some friends who are not primarily programmers, it was suggested I create some visual introductions to the key concepts which assume no background in programming jargon to begin with:
  • Representing a Model as a Graph ("thinking with graphs")
  • Analysis Patterns (see "Analysis Patterns: Reusable Object Models" by Martin Fowler, and more recent material on his website at http://www.martinfowler.com/articles.html )
  • What does querying, declarative programming style, mini-language mean to programmers?

The idea of "thinking with graphs" is illustrated in:

Over the next week, I'll develop two more visual intros to the other two topics.


Changed:
<
<
So Called Rules Engines
>
>
A more detailed discussion will be placed later at XAYAConceptsExplained

Deleted:
<
<
Extended Discussion of Analysis Patterns, and idea of Analysis Meta-Patterns.

Changed:
<
<

XAYAcore Iteration001 Notes

>
>

XAYAcore Iteration001 Notes


These notes are transcribed from mb's XAYAjournal, and will be changed as the first implementation comes into being. Right now, I'm just getting the notes down as quickly and roughly as possible, so that both Tyler and I can use them. Based on the specification, we'll both build as many functions as interest us, and compare notes. Others viewing this site are also welcome to play, and post their work here.

Line: 97 to 101

XAYAcore???: "The simplest relational network that could possibly work???"

Changed:
<
<

XAYAPrinciples

>
>

XAYAPrinciples


  1. Simplest representation that could possibly work.
  2. A functional programming style (while we are not Lisping, we are reasoning with data)
  3. Focus on logic (and design contracts)
Line: 106 to 110

  1. Use generator and iterator idioms for graph traversal (use as few and as consistent constructs as possible). The following article from David Mertz's "Charming Python" series provides a nice intro to these new(ish) Python features.
  2. Have fun. As the 7th principle, it is of prime importance wink (ouch)
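As a rough illustration of the generator principle, the sketch below walks a dictionary-based graph lazily, handing back one node at a time. The function name walk and the sample graph are assumptions for this example only.

```python
def walk(graph, start):
    """Lazily yield the nodes reachable from start, depth-first, parents first."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        yield node
        # Push children in reverse so they are visited in their listed order.
        stack.extend(reversed(graph.get(node, [])))


# Hypothetical graph: "d" is reachable through both "b" and "c".
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

order = list(walk(graph, "a"))  # consume the whole traversal
first = next(walk(graph, "a"))  # or take only as much as is needed
```

Because the traversal is a generator, a caller that only needs the first few nodes never pays for visiting the rest of the graph.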
Changed:
<
<

XAYAUseCase

>
>

XAYAUseCase


Changed:
<
<
A minimal Use Case to begin with, illustrating a command line session:
>
>
A minimal Use Case to begin with, illustrating a command line session:

  1. User provides an unordered list of selected nodes (representing data in db, ascii file, xml doc, etc)
  2. User is provided with a choice of paths
  3. User selects the path they want
  4. Path is translated to a query in the database (initially, the "database" is a simple YAML-like text-file format. Later, the concept of "database" will be expanded generically: it could be XML, ASCII CSV text, an OWL ontology, a relational database, etc.).
Changed:
<
<
Rough Procedure to support the Use Case
>
>
Rough Procedure to support the Use Case

  1. A list of unordered nodes is input to a function.
  2. The function calculates the path to the root(s) for each node.
  3. A list of paths is output from the function.
Line: 128 to 132

Note2 -- in terms of graph operations, this use case assumes directed acyclic graphs rather than trees. So we must deal with cases where a node has multiple parents, and hence the possibility that any 2 nodes are connected by more than one path.
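The rough procedure and the multiple-parents caveat can be sketched as follows. This is only an illustration under stated assumptions: the graph is a dictionary of parent to children, it is acyclic, and the names invert and paths_to_roots are invented here rather than taken from the XAYA specification.

```python
def invert(graph):
    """Build a child -> parents mapping from a parent -> children mapping."""
    parents = {node: [] for node in graph}
    for parent, children in graph.items():
        for child in children:
            parents.setdefault(child, []).append(parent)
    return parents


def paths_to_roots(graph, node):
    """Return every root-to-node path; assumes the graph has no cycles."""
    parents = invert(graph)

    def climb(current, tail):
        tail = [current] + tail
        if not parents.get(current):  # no parents: current is a root
            return [tail]
        found = []
        for parent in parents[current]:
            found.extend(climb(parent, tail))
        return found

    return climb(node, [])


# Hypothetical graph in which "species" has two parents.
graph = {
    "inventory": ["stand", "polygon"],
    "stand": ["species"],
    "polygon": ["species"],
    "species": [],
}

selected = ["species", "stand"]  # unordered user selection
result = {node: paths_to_roots(graph, node) for node in selected}
```

A node with two parents yields two paths, which is exactly the "choice of paths" the user is offered in the use case.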

Changed:
<
<
Necessary Simplifications for first Iteration
>
>
Necessary Simplifications for first Iteration

These simplifications are to restrict the scope of the use case

Line: 140 to 144

These simplifications make input and output of data to/from files fairly unproblematic, and simplify the graph traversal algorithms needed to find the path. Once we've got the system working with these restrictions, we can begin to loosen them and develop the more general case of graph traversal algorithms.

Changed:
<
<
Minimal Scope Simplified Use Case
>
>
Minimal Scope Simplified Use Case

  1. Read in a graph from an ASCII file
  2. Generate a NodeIndex from the graph. Write it to a file as another graph in the ASCII format.
Line: 153 to 157

  1. Every node in the "parent" graph is actually the name of another graph among the children (i.e. for every key in the parent, there is a graph).
  2. Various other patterns, which represent all possible combinations of a 2D matrix of the following triple representing our graph interpretation of a Python dictionary: GraphName, KeyName, ValueList.
Changed:
<
<

XAYASubjectDataGraphs.

>
>

XAYASubjectDataGraphs.


What Does the Graph Mean? 8 Interpretations of DictionaryAsGraphs to define a Knowledge Management System.

Line: 183 to 187

XAYA convention will be to name data using the first 4-5 letters of its interpretation as a prefix. For example: dataTblPolygons, objeStand, measTblPolygons, modelInventory

Changed:
<
<
XayaSubjectDataGraphsExample?
>
>
XayaSubjectDataGraphsExample

To make this a little less abstract, a small Forest Inventory based example has been created. The example graphs are stored in the attached file "devcore_20040826.zip" which you can download here:

Line: 191 to 195

Below follows an explanation of each graph, and the logical process by which we are working down from a high level Ontology to low level data models (or working up in reverse direction for that matter .... or working from the middle up and down, to be most accurate as to how it actually works in practice ... ). For those familiar with the history of data modelling that led to LifeLine ... the modelling process below can be called "Return of the Long Model" (prequel to The Database strikes back).

Changed:
<
<
XAYAFormat
>
>
XAYAFormat

Data is in text files in a format that can easily be converted to Python Dictionary structures where each line of a file has the following form:

Line: 207 to 211

There will have to be functions to inter-convert between the text format, DB tables, XML docs, and Python dictionaries. Python dictionaries form the universal-translator intermediate form for going between all other forms (à la Safe Software's approach to Semantic Translation). The whole thing can be made language independent via XML-RPC (which can handle dictionaries as structs).
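A pair of conversion functions between text and dictionaries might look roughly like the sketch below. The exact line format is not reproduced in this part of the page, so the sketch assumes a hypothetical "key : value1, value2" layout; parse_graph and dump_graph are illustrative names only.

```python
def parse_graph(text):
    """Turn lines of the assumed form 'key : v1, v2' into a dict of lists."""
    graph = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comments
            continue
        key, _, values = line.partition(":")
        graph[key.strip()] = [v.strip() for v in values.split(",") if v.strip()]
    return graph


def dump_graph(graph):
    """Write a dict of lists back out in the same assumed line format."""
    return "\n".join(
        f"{key} : {', '.join(values)}" for key, values in graph.items()
    )


text = "inventory : stand, polygon\nstand : species\nspecies :"
graph = parse_graph(text)
round_trip = parse_graph(dump_graph(graph))
```

With the dictionary as the intermediate form, each additional format only needs one reader and one writer, rather than a converter for every pair of formats.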

Changed:
<
<
Project Level
>
>
Project Level

Contained in the following files:

Line: 217 to 221

Graphs_X001.txt
  • This will be an autogenerated file which uses DataFiles_X001.txt to create a graph where the keys are file names, and the values are all keys in a particular file (i.e. the node list).
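The autogenerated "graph of graphs" could be built along these lines. This is a sketch only: it assumes the individual files have already been loaded as dictionaries, and the file names and the function name build_graph_index are placeholders.

```python
def build_graph_index(graphs):
    """Map each file name to the list of keys (the node list) in that file."""
    return {file_name: list(graph) for file_name, graph in graphs.items()}


# Hypothetical loaded files, keyed by file name.
graphs = {
    "objects.txt": {"Stand": ["Polygon"], "Polygon": []},
    "measures.txt": {"area": ["hectares"]},
}

index = build_graph_index(graphs)
```

The result is itself a dictionary of lists, so the same graph functions used on the data can be pointed at the project-level index.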
Changed:
<
<
Semantics/Ontology Level
>
>
Semantics/Ontology Level

Contained in the following files:

Line: 230 to 234

  • Each tuple represents a "predicate" (verb) , "object" pair. As in This Subject is/contains/has/pick-yer-verb Object.
  • Format here is more complex: Key : (verbA, Object1), (verbB, Object2) ....
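Reading the "Key : (verb, Object)" form into subject, predicate, object triples might be sketched as below. The sample line and the name parse_statements are assumptions made for illustration.

```python
import re

# One "(verb, object)" pair; neither part may contain commas or parentheses.
PAIR = re.compile(r"\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*\)")


def parse_statements(line):
    """Split 'Key : (verbA, Object1), (verbB, Object2)' into key and pairs."""
    subject, _, rest = line.partition(":")
    return subject.strip(), PAIR.findall(rest)


subject, pairs = parse_statements("Stand : (contains, Polygon), (has, Species)")

# Flatten into subject-predicate-object triples.
triples = [(subject, verb, obj) for verb, obj in pairs]
```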
Changed:
<
<
Structural level
>
>
Structural level

This is the part that corresponds to traditional data modelling in databases. These files are fairly close approximations of the LifeLine long model. There are three parallel files, which have the same key-set, but whose values are different forms of meta-data. The structural level could be said to deal with "collections" and is above the level of individual instances of data. It corresponds to Relational Modelling, as it is usually understood.

Line: 263 to 267

  • Values are files in the model.
  • In this case, I included mainly the "structural" and data files in the model. I could have added (and probably should have) all the semantic files.
Changed:
<
<
Mapping Semantics To Structures.
>
>
Mapping Semantics To Structures.

The object level can be considered akin to abstract classes, devoid of implementation. The Structural level deals with the implementation as tables or table-like structures (or alternatively, in the XML world, XML-like or Schema-conformant structures).

Line: 277 to 281

  • Keys are Object Attribute Names
  • Values are the corresponding Measure names (which may or may not be the same).
Changed:
<
<
Data Instance Level
>
>
Data Instance Level

The 11 graphs above are constant in form and interpretation. What changes is their contents, which depend on the actual data (instances) being modelled at a low level. One could say instance information flows through the Structural and Semantic levels, which do not change in form.

Line: 306 to 310

Changed:
<
<

XAYAPredicateFunctions.

>
>

XAYAPredicateFunctions.


Functions that tell you something about a graph, a node, an edge. Currently there are twelve functions. If some of the functions can be defined in terms of others, we can reduce this list to a set of axiomatic functions.
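To give a feel for what such predicate functions look like, and how some can be defined in terms of others, here is a small sketch. The four names below are invented for the example and are not the twelve functions of the specification.

```python
def is_node(graph, node):
    """True if node appears as a key or as somebody's child."""
    return node in graph or any(node in kids for kids in graph.values())


def has_edge(graph, parent, child):
    """True if there is a directed edge parent -> child."""
    return child in graph.get(parent, [])


def is_leaf(graph, node):
    """Defined via is_node: a known node with no children."""
    return is_node(graph, node) and not graph.get(node)


def is_root(graph, node):
    """Defined via is_node and has_edge: a known node with no parents."""
    return is_node(graph, node) and not any(
        has_edge(graph, parent, node) for parent in graph
    )


# Hypothetical graph for trying the predicates out.
graph = {"inventory": ["stand"], "stand": ["species"], "species": []}
```

In this sketch is_node and has_edge play the role of the axiomatic functions, with is_leaf and is_root derived from them.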

Line: 358 to 362

In addition to the general functions defined above, additional domain specific functions may be defined, using the general functions as their components. So, we are building "from the bottom up" several layers of language. For more on this, see the general article by Paul Graham on Programming Bottom Up
Changed:
<
<

XayaCoreSpecification

>
>

XayaCoreSpecification


Finally ..... The XAYA001 specification

Line: 393 to 397

15. killEdge(graphName, NodeName)
Changed:
<
<
Some More Rough Algorithms
>
>
Some More Rough Algorithms

.... To be continued, as the implementation is worked out.

Line: 402 to 406

Node Index Generation. Generate an index such as 1.1, 1.2.1, etc., defining node locations in the graph. Once such an index is generated, operations on it can be easier than full graph-walking algorithms (which don't make any assumptions about the graph's structure). In particular, regular expressions on the lists of node indexes can be used to find the paths.
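One possible shape for node index generation is sketched below, assuming an acyclic dictionary-based graph and a known list of roots. The names node_index and path_to are illustrative, and a node with several parents simply receives several index entries.

```python
import re


def node_index(graph, roots):
    """Return (index, node) pairs such as ('1.2.1', 'species')."""
    entries = []

    def visit(node, index):
        entries.append((index, node))
        for position, child in enumerate(graph.get(node, []), start=1):
            visit(child, f"{index}.{position}")

    for position, root in enumerate(roots, start=1):
        visit(root, str(position))
    return entries


def path_to(entries, target):
    """Recover a root-to-node path by looking up each prefix of the index."""
    by_index = dict(entries)
    parts = target.split(".")
    return [by_index[".".join(parts[:i])] for i in range(1, len(parts) + 1)]


graph = {
    "inventory": ["stand", "polygon"],
    "stand": ["species"],
    "polygon": ["species"],
    "species": [],
}
index = node_index(graph, ["inventory"])

# Everything at or below index 1.2, found with a regular expression.
subtree = [node for idx, node in index if re.match(r"1\.2(\.|$)", idx)]
path = path_to(index, "1.2.1")
```

Once the index exists, finding a path is string handling on the index rather than a fresh walk of the graph.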

Changed:
<
<

XAYAObjects

>
>

XAYAObjects


Currently, in the first iteration, there are no "official" objects, though obviously we are using some object-oriented/relational hybrid thinking in our modelling approach. Ultimately, we may create an object such as DiGraph(dict), inheriting from the built-in dictionary type. This is a bit different from some other approaches, which usually proceed by defining node and edge classes, and several types of graphs: graph, digraph, tree. This leads to a menagerie of graph-related objects.

Line: 419 to 423

The current functions then become methods within this class. And subclassing can over-ride what they do, and add further methods.
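A rough idea of what a dictionary-derived graph class could look like is given below. The method names and the Tree subclass are hypothetical, chosen only to show functions becoming methods and a subclass overriding one of them.

```python
class DiGraph(dict):
    """A directed graph stored as a dict of node -> list of child nodes."""

    def add_edge(self, parent, child):
        """Add parent -> child, creating either node if it is missing."""
        self.setdefault(parent, [])
        self.setdefault(child, [])
        if child not in self[parent]:
            self[parent].append(child)

    def roots(self):
        """Return the nodes that are nobody's child."""
        children = {c for kids in self.values() for c in kids}
        return [node for node in self if node not in children]


class Tree(DiGraph):
    """Overrides add_edge so that a node can have only one parent."""

    def add_edge(self, parent, child):
        for node, kids in self.items():
            if child in kids and node != parent:
                raise ValueError(f"{child!r} already has parent {node!r}")
        super().add_edge(parent, child)


g = DiGraph()
g.add_edge("inventory", "stand")
g.add_edge("inventory", "polygon")
```

Since DiGraph is still a dictionary, anything written against plain dictionaries keeps working on it unchanged.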

Changed:
<
<

XAYALanguage

>
>

XAYALanguage


So far, we've defined the mini-language as a library in Python (i.e. we're using Python itself as a little language, in the manner illustrated in the Python Cookbook, pg 449, which also contains yet another graph implementation example). The next step is to go further and actually extend the Python language. The notes here are some very preliminary ideas as to what such an extended language might look like.

Line: 471 to 475

Changed:
<
<

XAYAcode

>
>

XAYAcode


This is some initial code, developed to illustrate ideas for XAYA in its initial iteration. It should not be taken as the final code for the first iteration, but rather as a working out of code conventions and an initial command line interface via prototyping. Expect the code to be updated every few days. The top link is the most recent version of the code (so one gets a running view of the developing style). Current conventions are to use a test-first development style (define unit tests, then write code) via the unittest module, a pseudo-literate coding style that provides documentation of any algorithms used and references to code sources, and finally incorporation of usage examples via the doctest module. The code is currently more verbose than it needs to be, and will get stripped down. Try to balance conciseness and clarity (emphasis on "try").
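The conventions described here (unit tests first, usage examples as doctests) might be combined as in the following sketch. The function children and the test class are invented for illustration; they are not part of the posted code.

```python
import unittest


def children(graph, node):
    """Return the child nodes of node, or an empty list if node is unknown.

    Usage examples, checked by the doctest module:

    >>> children({"stand": ["species"], "species": []}, "stand")
    ['species']
    >>> children({"stand": ["species"], "species": []}, "missing")
    []
    """
    return list(graph.get(node, []))


class ChildrenTest(unittest.TestCase):
    """Written before the function body, in test-first style."""

    def setUp(self):
        self.graph = {"stand": ["species"], "species": []}

    def test_known_node(self):
        self.assertEqual(children(self.graph, "stand"), ["species"])

    def test_unknown_node(self):
        self.assertEqual(children(self.graph, "missing"), [])
```

Saved as a module, the unit tests can be run with python -m unittest and the docstring examples with python -m doctest.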

Line: 510 to 514


Changed:
<
<

XAYAThinkingWithData

>
>


xaya_wide_cmyk_lg.jpg

 <<O>>  Difference Topic XAYAThinkingWithData (r1.9 - 20 Oct 2004 - Main.utsim)

META TOPICPARENT WebHome

xaya_wide_cmyk_lg.jpg

XAYAThinkingWithData

Line: 46 to 46

XAYAvision

Added:
>
>
Freedom, Simplicity, Connection. These three words summarize the spirit behind the vision for Xaya. Xaya is centred around the idea of representing information as graphs, and being able to "navigate" through those graphs. This idea originated in the insight that relational systems (such as databases) can be modelled as graphs, and a query in such a system represents a "path through a graph". An earlier version of this idea was implemented in LifeLine, an application that focussed on representing databases as "tree-like" graphs, and allowing the user to query a database in fairly sophisticated ways without requiring knowledge of the underlying database structure, SQL, etc. At its centre was a query engine that translates between what the user has selected as a "scenario", a graph representation of the data, and the SQL needed to obtain the "scenario" from a relational database.

Added:
>
>
XAYA takes that basic idea, "a query is a path through a graph", and generalizes it so that the query engine is no longer concerned with a particular storage format (such as relational databases, XML, RDF, etc.). One should focus on the information and its inherent relationships, rather than the underlying format. In this context, we want to apply a set of design principles focussed on Logic, Abstraction, Graphs, Language.

Deleted:
<
<
A Language for thinking with data. Data is everywhere, and we try to make sense of it. But we keep getting caught up in the details. Is the data in a database? Is it in XML format? Is there a DTD, a Schema? Which query mechanism do we use? Navigating through information is like the early days of the automobile, where one had to be half-engineer simply to drive. XAYA's goal is to support the idea of "just driving", focussing on your understanding of information in a particular field (be it an area of business, science, history, etc.). As such, it allows you to model your knowledge and link it to existing data, and supports exploratory reasoning about relationships amongst information: in short, "Thinking with Data". More formally, "Logic".

Added:
>
>
A Language for thinking with data. Data is everywhere, and we try to make sense of it. But we keep getting caught up in the details. Here's a brief overview of the details we end up worrying about when thinking about data (as a prelude to thinking with data).

Changed:
<
<
Freedom, Simplicity, Connection. These three words summarize the spirit behind the vision for Xaya. Xaya is centred around the idea of representing information as graphs, and being able to "navigate" through those graphs. This idea originated in the insight that relational systems (such as databases) can be modelled as graphs, and a query in such a system represents a "path through a graph". An earlier version of this idea was implemented in LifeLine, an application that focussed on representing databases as "tree-like" graphs, and allowing the user to query a database in fairly sophisticated ways without requiring knowledge of the underlying database structure, SQL, etc. At its centre was a query engine that translates between what the user has selected as a "scenario", a graph representation of the data, and the SQL needed to obtain the "scenario" from a relational database.
>
>
First we have to worry about "what format the data is in, and how to access it". Is the data in a database? Is it in XML format? Is there a DTD, a Schema?

Second, we have to worry about "how do we query this data?". Which query mechanism do we use? How do we formulate the query? How do we query, if the relevant data is in several data sets?

Third, it gets even more complicated. All the data we think is relevant might not be in a single database. It might not all be in a single location. How could we connect and consolidate multiple sources of data that are about "the same things"?

Points one to three all concern the challenges of navigating through information in search of meaning. Navigating through information is like the early days of the automobile, where one had to be half-engineer simply to drive. XAYA's goal is to support the idea of "just driving", focussing on your understanding of information in a particular field (be it an area of business, science, history, etc.). As such, XAYA focusses on allowing you to model your knowledge and link it to existing data, and supports exploratory reasoning about relationships amongst information: in short, "Thinking with Data". More formally, "Logic".

The aim is for XAYA to be a Pythonic, logic-based mini-language for thinking with data, and to document the creation of that language as an extended tutorial on “Thinking with Data in Python”. Thinking with Data is a trial-and-error process of exploring relationships amongst data (here “data” can mean anything from numerical observations to semantic ontologies). One asks questions (queries), one gets answers (query result sets). Is the answer what you expected? No? Revise, and repeat. A great deal of planning, “business intelligence” and analysis work in various industries requires just such “thinking with data”, where questions are iteratively refined based on previous results. One of XAYA’s key goals is to provide a framework that can introduce Python to that business environment, and in particular to planners and analysts who are not primarily programmers, but who are required to reason with complex data and make valid inferences.


Changed:
<
<
XAYA takes that basic idea, "a query is a path through a graph", and generalizes it so that the query engine is no longer concerned with a particular storage format (such as relational databases, XML, RDF, etc.). One should focus on the information and its inherent relationships, rather than the underlying format. In this context, we want to apply a set of design principles focussed on Logic, Abstraction, Graphs, Language. See XayaGal (L = logic and/or language, so Graph Abstraction Logic or Graph Abstraction Language).
>
>
XAYA is built up from three elements:
  • A "miniature language" for representing and reasoning with graphs. There are many excellent graph packages in Python, and the goal is not to re-invent them, but to focus on having a set of primitives with which we can deal with a wide range of data as graphs. In particular, we want to build up the mini-language so it can support data in many formats in a uniform way. Achieving this sets us up for the second step: developing models of data via Analysis Patterns.
  • A template for Analysis Patterns. Analysis Patterns are models of a specific area of knowledge. A template for Analysis Patterns is a fill-in form that allows you to define a model for your particular area of knowledge. In particular, you want to identify the "objects" of your knowledge, their "relationships" and their "rules". The goal is to develop a template easy enough that non-programmers can use it to model their area of expertise. Given a uniform mini-language for representing data, and a template for modelling data and its relationships, we are ready to begin to ask questions of data.
  • A query engine for "finding paths through graphs". Given a mini-language for dealing with graphs, and a template for developing models in our system, we need a way to ask increasingly sophisticated questions. At the simplest level, this means supporting "finding paths through graphs". At higher levels of sophistication, the act of finding paths through graphs gradually becomes the basis of querying, and at an even more sophisticated level, a basis for "declarative programming", which focusses on defining "what" the solution is, while allowing an automated system (the query or inference engine) to figure out how to obtain it. For a capsule summary of declarative programming, and a comparison to some other styles, go to Wikipedia.

Deleted:
<
<
The basic vision of Xaya is as a modular system that follows the Unix Design Principles, and adheres to the core Modular Operators

Deleted:
<
<
Unix Design principles:
  • Build functions that do one thing only
  • Build functions that work with each other
  • "Use text streams because it is a universal interface". In our case, the universal interface will be a graph representation. This is similar to the idea that in an RDB, the universal interface is a table, and all operations on tables (selects, joins, constraints) themselves result in tables.

Modularity Design Principles (Modularity Operators)

  1. Splitting: Breaking an interdependent module into 2 independent modules.
  2. Substitution: Replacing a module with another implementation that is functionally equivalent.
  3. Subtraction: Getting rid of some modules without breaking the system.
  4. Augmentation: Adding new functionality through new modules without breaking the system.
  5. Inversion: Taking a low-level "hidden" module, bringing it to a higher level and making it visible.
  6. Porting: Being able to move a set of modules to a new environment and have them still function.

 <<O>>  Difference Topic XAYAThinkingWithData (r1.8 - 16 Oct 2004 - Main.utsim)

META TOPICPARENT WebHome

xaya_wide_cmyk_lg.jpg

XAYAThinkingWithData

Line: 32 to 32

Based on feedback on this site from some friends who are not primarily programmers, it was suggested I create some visual introductions to the key concepts which assume no background in programming jargon to begin with:

  • Representing a Model as a Graph ("thinking with graphs")
Changed:
<
<
  • Analysis Patterns
>
>

  • What does querying, declarative programming style, mini-language mean to programmers?

The idea of "thinking with graphs" is illustrated in:

 <<O>>  Difference Topic XAYAThinkingWithData (r1.7 - 13 Oct 2004 - Main.utsim)

META TOPICPARENT WebHome

xaya_wide_cmyk_lg.jpg

XAYAThinkingWithData

Line: 21 to 21

This site has been set up to support a Python Software Foundation grant proposal, and contains a basic description of XAYA's goals, philosophy, concepts, proposed methods, and some early code examples (code's going to be fairly thoroughly reworked this weekend -- mb Sept 30, 2004).

Added:
>
>
Original PSF Grant Application

Added:
>
>
Revision to PSF Grant Application
  • Based on feedback received from some friends after I submitted the original proposal, I did a slight revision to (i) add more discussion on Analysis Patterns, (ii) clarify the payment plan as a cost per module, and (iii) correct typos and refine the text for reading flow.
  • psfgrantXAYA_ThinkingWithData_20041013.pdf: Small Revisions to PSF Proposal for Readability

Added:
>
>

Some Preliminary Concepts Explained Visually


Added:
>
>
Based on feedback on this site from some friends who are not primarily programmers, it was suggested I create some visual introductions to the key concepts which assume no background in programming jargon to begin with:
  • Representing a Model as a Graph ("thinking with graphs")
  • Analysis Patterns
  • What does querying, declarative programming style, mini-language mean to programmers?

The idea of "thinking with graphs" is illustrated in:

Over the next week, I'll develop two more visual intros to the other two topics.

-- MishtuBanerjee - 13 Oct 2004


XAYAvision

Line: 458 to 476

This is some initial code, developed to illustrate ideas for XAYA in its initial iteration. It should not be taken as the final code for the first iteration, but rather as a working out of code conventions and an initial command line interface via prototyping. Expect the code to be updated every few days. The top link is the most recent version of the code (so one gets a running view of the developing style). Current conventions are to use a test-first development style (define unit tests, then write code) via the unittest module, a pseudo-literate coding style that provides documentation of any algorithms used and references to code sources, and finally incorporation of usage examples via the doctest module. The code is currently more verbose than it needs to be, and will get stripped down. Try to balance conciseness and clarity (emphasis on "try").
Added:
>
>

Line: 493 to 513

XAYAThinkingWithData


xaya_wide_cmyk_lg.jpg

Added:
>
>


META FILEATTACHMENT devcore_20040826.zip attr="" comment="Prototypes as of August 26, 2004" date="1096586045" path="C:\XAYA\devcore_20040826.zip" size="8585" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040914.zip attr="" comment="Prototypes as of September 14, 2004" date="1096586069" path="C:\XAYA\devcore_20040914.zip" size="21282" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040921.zip attr="" comment="Prototypes as of September 21, 2004" date="1096586105" path="C:\XAYA\devcore_20040921.zip" size="146169" user="utsim" version="1.1"
Line: 500 to 523

META FILEATTACHMENT XAYcoreUsageExample_20040920.doc attr="" comment="Command Line Usage Example %_Q_%Bind%_Q_%" date="1096617281" path="C:\XAYA\devcore\XAYcoreUsageExample_20040920.doc" size="88064" user="utsim" version="1.1"
META FILEATTACHMENT xaya_wide_cmyk_lg.jpg attr="" comment="Xaya Logo -- wide" date="1096618068" path="C:\Data\Biznet\Harmeny\Harmeny_OpenSourceBiz\XayaFinal\bitmap\cmyk\xaya_wide_cmyk_lg.jpg" size="51545" user="utsim" version="1.1"
META FILEATTACHMENT CreativeCommonsLogosomerights20.gif attr="" comment="" date="1096778211" path="C:\Data\Biznet\Harmeny\Harmeny_OpenSourceBiz\CreativeCommonsLogosomerights20.gif" size="1835" user="utsim" version="1.1"
Added:
>
>
META FILEATTACHMENT psfgrantXAYA_ThinkingWithData_20041013.pdf attr="" comment="Small Revisions to PSF Proposal for Readability" date="1097704940" path="C:\XAYA\XAYAFunding\psfgrantXAYA_ThinkingWithData_20041013.pdf" size="402780" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20041011.zip attr="" comment="Prototypes as of October 11, 2004" date="1097705124" path="C:\XAYA\devcore_20041011.zip" size="225303" user="utsim" version="1.1"
META FILEATTACHMENT XAYcoreUsageExample?_20041006.doc attr="" comment="Command Line Usage Example %_Q_%findAllPaths%_Q_%" date="1097705357" path="C:\XAYA\XAYAextension\UsageExamples\XAYcoreUsageExample_20041006.doc" size="75264" user="utsim" version="1.1"
META FILEATTACHMENT ThinkingWithGraphs?_20041011.pdf attr="" comment="A visual introduction to thinking with graphs" date="1097705990" path="C:\XAYA\XAYAextension\ThinkingWithGraphs_20041011.pdf" size="740604" user="utsim" version="1.1"
 <<O>>  Difference Topic XAYAThinkingWithData (r1.6 - 05 Oct 2004 - Main.utsim)

META TOPICPARENT WebHome
Changed:
<
<

xaya_wide_cmyk_lg.jpg
>
>

xaya_wide_cmyk_lg.jpg

XAYAThinkingWithData

Line: 491 to 491


XAYAThinkingWithData

Changed:
<
<

xaya_wide_cmyk_lg.jpg
>
>

xaya_wide_cmyk_lg.jpg

META FILEATTACHMENT devcore_20040826.zip attr="" comment="Prototypes as of August 26, 2004" date="1096586045" path="C:\XAYA\devcore_20040826.zip" size="8585" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040914.zip attr="" comment="Prototypes as of September 14, 2004" date="1096586069" path="C:\XAYA\devcore_20040914.zip" size="21282" user="utsim" version="1.1"
 <<O>>  Difference Topic XAYAThinkingWithData (r1.5 - 03 Oct 2004 - Main.utsim)

META TOPICPARENT WebHome

xaya_wide_cmyk_lg.jpg

XAYAThinkingWithData

Line: 58 to 58

Write up on October 1, 2004

Added:
>
>
Ascendency -- Original biological motivation -- from study of ecosystems as "networks of interaction". Bob Ulanowicz's information theoretic framework for quantifying ecosystem dynamics. Realization same ideas could be transferred over from biology networks to "computer networks".

GAL -- Boolean, Nested Boolean, Predicate Logic as graphs.

Query Languages and Logic Programming

Line: 106 to 108

Note1 -- we go from an unordered set of initial nodes as our input to a list of ordered nodes as our final output (the path). I'm glossing over what the translator has to do, so I can concentrate on the graph logic part.

Changed:
<
<
Note2 -- in terms of graph operations, this use case is more general than LifeLine, since it is assuming acyclic graphs (forests), rather than trees.
>
>
Note2 -- in terms of graph operations, this use case is assuming acyclic graphs (forests), rather than trees. So we must deal with cases where a node has multiple parents, and hence the possibility that any 2 nodes are connected by more than one path.

Necessary Simplifications for first Iteration
Line: 468 to 470


Deleted:
<
<
* psfgrantXAYA_ThinkingWithData_20040930.pdf: PSF Grant Proposal: XAYA Thinking With Data

Line: 490 to 490


Changed:
<
<


xaya_wide_cmyk_lg.jpg

>
>

XAYAThinkingWithData


xaya_wide_cmyk_lg.jpg

META FILEATTACHMENT devcore_20040826.zip attr="" comment="Prototypes as of August 26, 2004" date="1096586045" path="C:\XAYA\devcore_20040826.zip" size="8585" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040914.zip attr="" comment="Prototypes as of September 14, 2004" date="1096586069" path="C:\XAYA\devcore_20040914.zip" size="21282" user="utsim" version="1.1"
 <<O>>  Difference Topic XAYAThinkingWithData (r1.4 - 03 Oct 2004 - Main.utsim)

META TOPICPARENT WebHome
Changed:
<
<

xaya_wide_cmyk_lg.jpg
>
>

xaya_wide_cmyk_lg.jpg

XAYAThinkingWithData

Line: 478 to 478

Revised -- MishtuBanerjee - 30 Sep 2004

Added:
>
>


CreativeCommonsLogosomerights20.gif

All trademarks and copyrights on this page are owned by their respective owners.

Comments are owned by the individual posters.

This work is licensed under the Creative Commons License -- Attribution Share Alike 2.0.




xaya_wide_cmyk_lg.jpg

Added:
>
>


META FILEATTACHMENT devcore_20040826.zip attr="" comment="Prototypes as of August 26, 2004" date="1096586045" path="C:\XAYA\devcore_20040826.zip" size="8585" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040914.zip attr="" comment="Prototypes as of September 14, 2004" date="1096586069" path="C:\XAYA\devcore_20040914.zip" size="21282" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040921.zip attr="" comment="Prototypes as of September 21, 2004" date="1096586105" path="C:\XAYA\devcore_20040921.zip" size="146169" user="utsim" version="1.1"
META FILEATTACHMENT psfgrantXAYA_ThinkingWithData_20040930.pdf attr="" comment="PSF Grant Proposal: XAYA Thinking With Data" date="1096599921" path="C:\XAYA\XAYAFunding\psfgrantXAYA_ThinkingWithData_20040930.pdf" size="349444" user="utsim" version="1.1"
META FILEATTACHMENT XAYcoreUsageExample?_20040920.doc attr="" comment="Command Line Usage Example %_Q_%Bind%_Q_%" date="1096617281" path="C:\XAYA\devcore\XAYcoreUsageExample_20040920.doc" size="88064" user="utsim" version="1.1"
META FILEATTACHMENT xaya_wide_cmyk_lg.jpg attr="" comment="Xaya Logo -- wide" date="1096618068" path="C:\Data\Biznet\Harmeny\Harmeny_OpenSourceBiz\XayaFinal\bitmap\cmyk\xaya_wide_cmyk_lg.jpg" size="51545" user="utsim" version="1.1"
Added:
>
>
META FILEATTACHMENT CreativeCommonsLogosomerights20?.gif attr="" comment="" date="1096778211" path="C:\Data\Biznet\Harmeny\Harmeny_OpenSourceBiz\CreativeCommonsLogosomerights20.gif" size="1835" user="utsim" version="1.1"
 <<O>>  Difference Topic XAYAThinkingWithData (r1.3 - 01 Oct 2004 - Main.utsim)

META TOPICPARENT WebHome
Added:
>
>

xaya_wide_cmyk_lg.jpg

XAYAThinkingWithData

Deleted:
<
<
XAYAcore: "A Language for Thinking with Data"

Changed:
<
<
XAYAvision: "Freedom Simplicity Connection"
>
>

XAYAcore: "A Language for Thinking with Data"

XAYAvision: "Freedom Simplicity Connection"


Line: 17 to 26

Changed:
<
<

XAYAvision

>
>

XAYAvision


Line: 44 to 53

  1. Porting : Being able to move a set of modules to a new environment and have them still function
Deleted:
<
<
Imagine Xaya as a set of concentric circles (ripples in a pond).

Changed:
<
<
  • at the centre is:
    • XayaGal? -- Graph Abstraction Logic: Design Principles and Meta Data (think Posix design principles, TCP/IP, .... general protocols or standards)
    • XayaCore? -- reference implementation of XayaGal? in a particular language, in our case Python. Someone else could implement the equivalent in another language, or even in Python have a version that is internally different but equivalent in its operations.
>
>

XAYAconcepts

Write up on October 1, 2004

GAL -- Boolean, Nested Boolean, Predicate Logic as graphs.


Added:
>
>
Query Languages and Logic Programming

Changed:
<
<
  • In the second ring, based on XayaGal? and XayaCore? are the specific applications
    • XayaGeo? -- embedding XayaCore? in OpenEV?. Similar embedding could occur in JUMP with a Java port of XayaCore? via Jython (or in MapServer?, or via HTML). XayaGeo? can basically be built at the same time as XayaCore?, thereby testing that the core is embeddable in other applications.
    • XayaApp? -- Adding back in the GUI of LifeLine, but now on this new core. Since the new core is no longer restricted to trees, there will have to be some re-thinking of the GUI/user interactions. Target: Go from first iteration of XayaGal?/XayaCore to XayaApp? in 6 months (by year end).
    • Xayaflow -- Distributed information system central to the GridAgent? / GetSmart? network Dave Cohen, Tyler Mitchell and I have discussed over the last year. Central concept is a distributed model of information flow. See "Information Flow: The Logic of Distributed Systems" by Barwise and Seligman; available on Amazon at http://www.amazon.ca/exec/obidos/ASIN/0521583861/qid=1090570111/sr=1-2/ref=sr_1_0_2/702-0647905-1780835
    • XayaQuest? -- Statistical analysis system, using search algorithms instead of conventional statistics. ..... Mb's long term data analysis engine goal.
    • And so on. Other folks will build other things from XayaCore? ....
>
>
So Called Rules Engines

Changed:
<
<
  • Finally in the third ring, there are frameworks.
    • E.g. LifeLine : combination of XayaApp? + custom modules for processing inventory data + custom forecasting models and classification systems + custom data models
    • E.g. The GetSmart?/ distributed agent system with statistical learning ..... for fault tolerant networks.
    • E.g. : A GIS toolkit based on XAYA embedded in OpenEV?. The idea Tyler and I have been discussing of the GIS data processing flow of: Storage, Manipulation and Visualization, Publishing of information. (The model applies not just to GIS -- but to other areas of analysis, like statistical visualization systems)
>
>
Extended Discussion of Analysis Patterns, and idea of Analysis Meta-Patterns.

Changed:
<
<

XAYAcore Iteration001 Notes

>
>

XAYAcore Iteration001 Notes


These notes are transcribed from mb's XAYAjournal, and will be changed as the first implementation comes into being. Right now, I'm just getting the notes down as quickly and roughly as possible, so that both Tyler and I can use them. Based on the specification, we'll both build as many functions as interest us, and compare notes. Others viewing this site are also welcome to play, and post their work here

Line: 74 to 77

XAYAcore???: "The simplest relational network that could possibly work???"

Changed:
<
<

XayaPrinciples?

>
>

XAYAPrinciples


  1. Simplest representation that could possibly work.
  2. A functional programming style (while we are not Lisping, we are reasoning with data)
  3. Focus on logic (and design contracts)
Line: 83 to 86

  1. Use generator and iterator idioms for graph traversal (use as few and as consistent constructs as possible). The following article from David Mertz's "Charming Python" series provides a nice intro to these new(ish) Python features.
  2. Have fun. As the 7th principle, it is of prime importance wink (ouch)
Changed:
<
<

XAYAUseCase?

>
>

XAYAUseCase


A minimal Use Case to begin with, illustrating a command line session:
  1. User provides an unordered list of selected nodes (representing data in db, ascii file, xml doc, etc)
Line: 130 to 133

  1. Every node in the "parent" graph is actually the name of another graph among the children (i.e. for every key in the parent, there is a graph).
  2. Various other patterns, which represent all possible combinations of a 2D matrix of the following triple representing our graph interpretation of a Python dictionary: GraphName, KeyName, ValueList.
Changed:
<
<

XAYASubjectDataGraphs?.

>
>

XAYASubjectDataGraphs.


What Does the Graph Mean? 8 Interpretations of DictionaryAsGraphs? to define a Knowledge Management System.

Line: 164 to 167

To make this a little less abstract, a small Forest Inventory based example has been created. The example graphs are stored in the attached file "devcore_20040826.zip" which you can download here:

Changed:
<
<
>
>
  • devcore_20040826.zip: Prototypes as of August 26, 2004 -- First Iteration of XAYA Graph Data Model Process

Below follows an explanation of each graph, and the logical process by which we are working down from a high level Ontology to low level data models (or working up in reverse direction for that matter .... or working from the middle up and down, to be most accurate as to how it actually works in practice ... ). For those familiar with the history of data modelling that led to LifeLine ... the modelling process below can be called "Return of the Long Model" (prequel to The Database strikes back).

Line: 283 to 286

Changed:
<
<

XAYAPredicateFunctions?.

>
>

XAYAPredicateFunctions.


Functions that tell you something about a graph, a node, an edge. Currently there are twelve functions. If some of the functions can be defined in terms of others, we can reduce this list to a set of axiomatic functions.

Line: 335 to 338

In addition to the general functions defined above, additional domain specific functions may be defined, using the general functions as their components. So, we are building "from the bottom up" several layers of language. For more on this, see the general article by Paul Graham on Programming Bottom Up
Changed:
<
<

XayaCoreSpecification?

>
>

XayaCoreSpecification


Finally ..... The XAYA001 specification

Line: 379 to 382

Node Index Generation To generate an index such as 1.1, 1.2.1, etc, defining node locations in the graph. Once such an index is generated, operations on it can be easier than full graph-walking algorithms (which don't make any assumptions about the graph's structure). In particular, regular expressions on the lists of node indexes can be used to find the paths.
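
As a rough illustration of the node index idea, here is a minimal sketch in present-day Python. The function names (node_indexes, path_for) and the small example graph are hypothetical, not taken from the XAYA code, and the sketch assumes a single-rooted acyclic graph stored as a dictionary of parent to children. A node with two parents simply receives two indexes.

```python
import re
from typing import Dict, Iterator, List, Tuple

Graph = Dict[str, List[str]]  # parent node -> list of child nodes


def node_indexes(graph: Graph, root: str, prefix: str = "1") -> Iterator[Tuple[str, str]]:
    """Yield (index, node) pairs such as ("1.2.1", "Area").

    Assumes the graph is acyclic; a cycle would recurse forever.
    """
    yield prefix, root
    for position, child in enumerate(graph.get(root, []), start=1):
        yield from node_indexes(graph, child, f"{prefix}.{position}")


def path_for(index: str, by_index: Dict[str, str]) -> List[str]:
    """Recover the path from the root by reading off every prefix of the index."""
    parts = index.split(".")
    return [by_index[".".join(parts[:i])] for i in range(1, len(parts) + 1)]


graph = {"Stand": ["Polygon", "Species"], "Polygon": ["Area"], "Species": ["Area"]}
index = list(node_indexes(graph, "Stand"))
by_index = dict(index)

# Every path that ends at "Area", found without walking the graph again.
paths_to_area = [path_for(i, by_index) for i, node in index if node == "Area"]

# A regular expression over the indexes selects the whole subtree under 1.2.
under_species = re.compile(r"^1\.2(\.\d+)*$")
subtree = [node for i, node in index if under_species.match(i)]
```

The index is computed once; after that, path and subtree questions become string operations on the index rather than graph walks.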

Changed:
<
<

XAYAObjects

>
>

XAYAObjects


Currently in the first iteration there are no "official" objects, though obviously we are using some object oriented/relational hybrid thinking in our modelling approach. Ultimately, we may create an object such as DiGraph(dict), inheriting from the built-in dictionary type. This is a bit different from some other approaches, which usually proceed by defining node and edge classes, and several types of graphs: graph, digraph, tree. This leads to a menagerie of graph related objects.

Line: 396 to 399

The current functions then become methods within this class. And subclassing can over-ride what they do, and add further methods.
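
A minimal sketch of what such a class might look like, written in present-day Python. The method names (nodes, edges, parents) and the Tree subclass are assumptions for illustration only; the first iteration defines no official objects.

```python
class DiGraph(dict):
    """A directed graph stored as {parent: [children]}, reusing dict behaviour."""

    def nodes(self):
        """All node names: every key plus every child that appears in a value list."""
        found = set(self)
        for children in self.values():
            found.update(children)
        return sorted(found)

    def edges(self):
        """Every (parent, child) pair implied by the dictionary."""
        return [(parent, child) for parent, children in self.items() for child in children]

    def parents(self, node):
        """Keys whose value list contains the node."""
        return [parent for parent, children in self.items() if node in children]


class Tree(DiGraph):
    """Hypothetical subclass that over-rides a method to enforce a single parent."""

    def parents(self, node):
        found = super().parents(node)
        if len(found) > 1:
            raise ValueError(f"{node!r} has more than one parent: {found}")
        return found


g = DiGraph({"Stand": ["Polygon", "Species"], "Polygon": ["Area"], "Species": ["Area"]})
all_nodes = g.nodes()
area_parents = g.parents("Area")
```

Because DiGraph is still a dictionary, the existing dictionary-based functions and the text file format keep working unchanged.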

Changed:
<
<

XAYALanguage

>
>

XAYALanguage


So far, we've defined the mini-language as a library in Python (i.e. we're using Python itself as a little language, in the manner illustrated in the Python Cookbook, pg 449, which also contains yet another graph implementation example). The next step, is to go further and actually extend the Python language. The notes here are some very preliminary ideas as to what such an extended language might look like.

Line: 448 to 451

Changed:
<
<

XAYAcode

>
>

XAYAcode


This is some initial code, developed to illustrate ideas for XAYA in its initial iteration. It should not be taken as the final code for the first iteration, but rather as a working out of code conventions and an initial command line interface via prototyping. Expect the code to be updated every few days. The top link is the most recent version of the code (so one gets a running view of the developing style). Current conventions are: a test-first development style (define unit tests, then write code) via the unittest module; a pseudo-literate coding style that documents any algorithms used and references code sources; and usage examples incorporated via the doctest module. The code is currently more verbose than it needs to be, and will get stripped down. The aim is to try to balance conciseness and clarity (emphasis on "try").

Line: 456 to 459

Added:
>
>

Line: 473 to 478

Revised -- MishtuBanerjee - 30 Sep 2004

Added:
>
>


xaya_wide_cmyk_lg.jpg


META FILEATTACHMENT devcore_20040826.zip attr="" comment="Prototypes as of August 26, 2004" date="1096586045" path="C:\XAYA\devcore_20040826.zip" size="8585" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040914.zip attr="" comment="Prototypes as of September 14, 2004" date="1096586069" path="C:\XAYA\devcore_20040914.zip" size="21282" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040921.zip attr="" comment="Prototypes as of September 21, 2004" date="1096586105" path="C:\XAYA\devcore_20040921.zip" size="146169" user="utsim" version="1.1"
META FILEATTACHMENT psfgrantXAYA_ThinkingWithData_20040930.pdf attr="" comment="PSF Grant Proposal: XAYA Thinking With Data" date="1096599921" path="C:\XAYA\XAYAFunding\psfgrantXAYA_ThinkingWithData_20040930.pdf" size="349444" user="utsim" version="1.1"
Added:
>
>
META FILEATTACHMENT XAYcoreUsageExample?_20040920.doc attr="" comment="Command Line Usage Example %_Q_%Bind%_Q_%" date="1096617281" path="C:\XAYA\devcore\XAYcoreUsageExample_20040920.doc" size="88064" user="utsim" version="1.1"
META FILEATTACHMENT xaya_wide_cmyk_lg.jpg attr="" comment="Xaya Logo -- wide" date="1096618068" path="C:\Data\Biznet\Harmeny\Harmeny_OpenSourceBiz\XayaFinal\bitmap\cmyk\xaya_wide_cmyk_lg.jpg" size="51545" user="utsim" version="1.1"
 <<O>>  Difference Topic XAYAThinkingWithData (r1.2 - 01 Oct 2004 - Main.utsim)

META TOPICPARENT WebHome

XAYAThinkingWithData

Line: 10 to 10

Return to WebHome

Changed:
<
<
This site has been set up to support a Python Software Foundation grant proposal, and contains a basic description of XAYA's goals, philosophy, concepts, proposed methods, and some early code examples.
>
>
This site has been set up to support a Python Software Foundation grant proposal, and contains a basic description of XAYA's goals, philosophy, concepts, proposed methods, and some early code examples (code's going to be fairly thoroughly reworked this weekend -- mb Sept 30, 2004).


XAYAvision

Line: 456 to 461

Added:
>
>

* psfgrantXAYA_ThinkingWithData_20040930.pdf: PSF Grant Proposal: XAYA Thinking With Data


Original -- MishtuBanerjee - 16 Aug 2004

Revised -- MishtuBanerjee - 30 Sep 2004

Added:
>
>


META FILEATTACHMENT devcore_20040826.zip attr="" comment="Prototypes as of August 26, 2004" date="1096586045" path="C:\XAYA\devcore_20040826.zip" size="8585" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040914.zip attr="" comment="Prototypes as of September 14, 2004" date="1096586069" path="C:\XAYA\devcore_20040914.zip" size="21282" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040921.zip attr="" comment="Prototypes as of September 21, 2004" date="1096586105" path="C:\XAYA\devcore_20040921.zip" size="146169" user="utsim" version="1.1"
Added:
>
>
META FILEATTACHMENT psfgrantXAYA_ThinkingWithData_20040930.pdf attr="" comment="PSF Grant Proposal: XAYA Thinking With Data" date="1096599921" path="C:\XAYA\XAYAFunding\psfgrantXAYA_ThinkingWithData_20040930.pdf" size="349444" user="utsim" version="1.1"
 <<O>>  Difference Topic XAYAThinkingWithData (r1.1 - 30 Sep 2004 - Main.utsim)
Line: 1 to 1
Added:
>
>
META TOPICPARENT WebHome

XAYAThinkingWithData

XAYAcore: "A Language for Thinking with Data"

XAYAvision: "Freedom Simplicity Connection"

Return to WebHome

This site has been set up to support a Python Software Foundation grant proposal, and contains a basic description of XAYA's goals, philosophy, concepts, proposed methods, and some early code examples.

XAYAvision

A Language for thinking with data. Data is everywhere, we try and make sense of it. But we keep getting caught up in the details. Is the data in a database? Is it in XML format? Is there a DTD, a Schema? Which query mechanism do we use? Navigating through information is like the early days of the automobile -- where one has to be half-engineer, simply to drive. XAYA's goal is to support the idea of "just driving" and focussing on your understanding of information in a particular field (be it an area of business, science, history, etc). As such it supports allowing you to model your knowledge, and link it to existing data, and supports exploratory reasoning about relationships amongst information, in short, "Thinking with Data". More formally, "Logic".

Freedom, Simplicity, Connection. These three words summarize the spirit behind the vision for Xaya. Xaya is centred around the idea of representing information as graphs, and being able to "navigate" through those graphs. This idea originated in the insight that relational systems (such as databases) can be modelled as graphs, and a query in such a system represents a "path through a graph". An earlier version of this idea was implemented in LifeLine, an application that focussed on representing databases as "tree-like" graphs, and allowing the user to query a database in fairly sophisticated ways without requiring knowledge of the underlying database structure, SQL, etc. At its centre was a query engine that translates between what the user has selected as a "scenario", a graph representation of the data, and the SQL needed to obtain the "scenario" from a relational database.

XAYA takes that basic idea, "a query is a path through a graph", and generalizes it so that the query engine is no longer concerned with a particular storage format (such as relational databases, XML, RDF etc). One should focus on the information and its inherent relationships, rather than the underlying format. In this context, we want to try and apply a set of design principles that are focussed on Logic, Abstraction, Graphs, Language. See XayaGal? (l = logic and/or language, so Graph Abstraction Logic or Graph Abstraction Language).

The basic vision of Xaya is as a modular system that follows the Unix Design Principles, and adheres to the core Modularity Operators.

Unix Design principles:

  • Build functions that do one thing only
  • Build functions that work with each other
  • "Use text streams because it is a universal interface". In our case, the universal interface will be a graph representation. This is similar to the idea that in an RDB, the universal interface is a table, and all operations on tables (selects, joins, constraints) themselves result in tables.

Modularity Design Principles (Modularity Operators)

  1. Splitting: Breaking an interdependent module into 2 independent modules.
  2. Substitution: Replacing a module with another implementation that is functionally equivalent.
  3. Subtraction: Getting rid of some modules without breaking the system.
  4. Augmentation: Adding new functionality through new modules without breaking the system.
  5. Inversion: Taking a low level "hidden" module, bringing it to a higher level, and making it visible.
  6. Porting: Being able to move a set of modules to a new environment and have them still function.

Imagine Xaya as a set of concentric circles (ripples in a pond).

  • at the centre is:
    • XayaGal? -- Graph Abstraction Logic: Design Principles and Meta Data (think Posix design principles, TCP/IP, .... general protocols or standards)
    • XayaCore? -- reference implementation of XayaGal? in a particular language, in our case Python. Someone else could implement the equivalent in another language, or even in Python have a version that is internally different but equivalent in its operations.

  • In the second ring, based on XayaGal? and XayaCore? are the specific applications
    • XayaGeo? -- embedding XayaCore? in OpenEV?. Similar embedding could occur in JUMP with a Java port of XayaCore? via Jython (or in MapServer?, or via HTML). XayaGeo? can basically be built at the same time as XayaCore?, thereby testing that the core is embeddable in other applications.
    • XayaApp? -- Adding back in the GUI of LifeLine, but now on this new core. Since the new core is no longer restricted to trees, there will have to be some re-thinking of the GUI/user interactions. Target: Go from first iteration of XayaGal?/XayaCore to XayaApp? in 6 months (by year end).
    • Xayaflow -- Distributed information system central to the GridAgent? / GetSmart? network Dave Cohen, Tyler Mitchell and I have discussed over the last year. Central concept is a distributed model of information flow. See "Information Flow: The Logic of Distributed Systems" by Barwise and Seligman; available on Amazon at http://www.amazon.ca/exec/obidos/ASIN/0521583861/qid=1090570111/sr=1-2/ref=sr_1_0_2/702-0647905-1780835
    • XayaQuest? -- Statistical analysis system, using search algorithms instead of conventional statistics. ..... Mb's long term data analysis engine goal.
    • And so on. Other folks will build other things from XayaCore? ....

  • Finally in the third ring, there are frameworks.
    • E.g. LifeLine : combination of XayaApp? + custom modules for processing inventory data + custom forecasting models and classification systems + custom data models
    • E.g. The GetSmart?/ distributed agent system with statistical learning ..... for fault tolerant networks.
    • E.g. : A GIS toolkit based on XAYA embedded in OpenEV?. The idea Tyler and I have been discussing of the GIS data processing flow of: Storage, Manipulation and Visualization, Publishing of information. (The model applies not just to GIS -- but to other areas of analysis, like statistical visualization systems)

XAYAcore Iteration001 Notes

These notes are transcribed from mb's XAYAjournal, and will be changed as the first implementation comes into being. Right now, I'm just getting the notes down as quickly and roughly as possible, so that both Tyler and I can use them. Based on the specification, we'll both build as many functions as interest us, and compare notes. Others viewing this site are also welcome to play, and post their work here

-- MishtuBanerjee - 16 Aug 2004

Wiki: "The simplest online database that could possibly work" -- Ward Cunningham.

XAYAcore???: "The simplest relational network that could possibly work???"

XayaPrinciples?

  1. Simplest representation that could possibly work.
  2. A functional programming style (while we are not Lisping, we are reasoning with data)
  3. Focus on logic (and design contracts)
  4. Objects if necessary -- but as few and general as possible.
  5. Clear, readable, documented code
  6. Use generator and iterator idioms for graph traversal (use as few and as consistent constructs as possible). The following article from David Mertz's "Charming Python" series provides a nice intro to these new(ish) Python features.
  7. Have fun. As the 7th principle, it is of prime importance wink (ouch)
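
To make the sixth principle concrete, here is a small sketch of a generator-based traversal over a dictionary-as-graph, in present-day Python. The function name walk_depth_first and the example graph are hypothetical and not part of the XAYA prototypes.

```python
from typing import Dict, Iterator, List

Graph = Dict[str, List[str]]  # parent node -> list of child nodes


def walk_depth_first(graph: Graph, start: str) -> Iterator[str]:
    """Yield each node reachable from `start` once, in depth-first order."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        yield node
        # Push children in reverse so they are visited in their listed order.
        stack.extend(reversed(graph.get(node, [])))


forest = {
    "Stand": ["Polygon", "Species"],
    "Polygon": ["Area"],
    "Species": ["Area"],
}
visited = list(walk_depth_first(forest, "Stand"))
```

Because the traversal is a generator, a caller can stop early (for instance at the first matching node) without the function building the full list of nodes.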

XAYAUseCase?

A minimal Use Case to begin with, illustrating a command line session:
  1. User provides an unordered list of selected nodes (representing data in db, ascii file, xml doc, etc)
  2. User is provided with a choice of paths
  3. User selects the path they want
  4. Path is translated to a query in the database (initially, "database" is a simple YAML like text-file format. Later, concept of "database" expanded generically -- could be XML, ascii csv text, an OWL Ontology, relational database, etc ).

Rough Procedure to support the Use Case
  1. A list of unordered nodes is input to a function.
  2. The function calculates the path to the root(s) for each node.
  3. A list of paths is output from the function.
  4. Paths that are proper subsets of other paths are eliminated.
  5. If one or more of the remaining paths includes all nodes in the original list of selections, these paths are presented to the user.
  6. If no path incorporates all selected nodes, the user is given a warning that "Multiplication May Occur". The user is allowed to select one or more paths.
  7. Once the user has made a selection, any common nodes from the selected paths are removed.
  8. The final resulting node set is translated to a query in the data source.
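
Steps 1 to 6 of the procedure above can be sketched roughly as follows, in present-day Python. All function names and the example graph are hypothetical, and the sketch makes two assumptions not stated in the notes: the graph is acyclic, and "proper subset" is read as a comparison of the node sets of two paths.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, List[str]]  # parent node -> list of child nodes
Path = Tuple[str, ...]


def parents_of(graph: Graph, node: str) -> List[str]:
    """Keys whose value list contains the node."""
    return [key for key, children in graph.items() if node in children]


def paths_to_roots(graph: Graph, node: str) -> List[Path]:
    """Step 2: every root-to-node path (assumes an acyclic graph)."""
    parents = parents_of(graph, node)
    if not parents:
        return [(node,)]
    return [path + (node,) for parent in parents for path in paths_to_roots(graph, parent)]


def drop_subset_paths(paths: List[Path]) -> List[Path]:
    """Step 4: drop paths whose node set is a proper subset of another path's."""
    unique = list(dict.fromkeys(paths))  # remove exact duplicates, keep order
    return [p for p in unique if not any(set(p) < set(q) for q in unique)]


def candidate_paths(graph: Graph, selected: List[str]) -> Tuple[List[Path], List[Path]]:
    """Steps 1-5: return (remaining paths, paths covering every selected node)."""
    all_paths = [p for node in selected for p in paths_to_roots(graph, node)]
    remaining = drop_subset_paths(all_paths)
    covering = [p for p in remaining if set(selected) <= set(p)]
    return remaining, covering


graph = {
    "Inventory": ["Stand", "Plot"],
    "Stand": ["Polygon"],
    "Plot": ["Tree"],
    "Polygon": ["Species"],
}
remaining, covering = candidate_paths(graph, ["Polygon", "Species"])
split_remaining, split_covering = candidate_paths(graph, ["Polygon", "Tree"])
if not split_covering:
    print("Warning: Multiplication May Occur")  # step 6
```

In the second call no single path contains both selected nodes, which is the situation in which step 6 issues its warning and lets the user pick among the remaining paths.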

Note1 -- we go from an unordered set of initial nodes as our input to a list of ordered nodes as our final output (the path). I'm glossing over what the translator has to do, so I can concentrate on the graph logic part.

Note2 -- in terms of graph operations, this use case is more general than LifeLine, since it is assuming acyclic graphs (forests), rather than trees.

Necessary Simplifications for first Iteration

These simplifications are to restrict the scope of the use case

  1. Graphs are stored in a simple text file that imitates the Python dictionary format.
  2. Restrict to a single-rooted forest (two branches can still join in a leaf node).
  3. Instead of an unordered list of selected nodes as input, a pair of ordered nodes is submitted, representing the PathStartNode and the PathEndNode.
  4. Both the keys, and the list consist of strings (i.e. don't worry about Types just yet)

These simplifications make input and output of data to/from files fairly unproblematic, and simplify the nature of the graph traversal algorithms needed to find the path. Once we've got the system working with these restrictions, we can begin to loosen them, and develop more general graph traversal algorithms.

Minimal Scope Simplified Use Case

  1. Read in a graph from an Ascii file
  2. Generate a NodeIndex? from the graph. Write it to a file as another graph in the Ascii format.
  3. Given a list that is a pair of nodes, [PathStartNode, PathEndNode], find all paths. Write the paths as a graph where the key is a tuple (PathStartNode, PathEndNode) and the value is the sorted list of nodes on the path.
  4. Write a Python dictionary, corresponding to our Graph model to a file.
  5. For two hierarchically related graphs (say one defining measures columns, and one defining data tuples), use the index value of the list for "parent" graph to access all the values at that index for every key of the "child" graph. This approximates a simple relational query.
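
Step 3 of the list above (find all paths between a start node and an end node) might look something like this in present-day Python. The name find_all_paths echoes the attached usage example, but the body here is only a sketch written for this page, not the code in the devcore archives. Since two branches may join in a leaf, the value stored under the (start, end) key is assumed to be a list of paths rather than a single path.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, List[str]]  # parent node -> list of child nodes


def find_all_paths(graph: Graph, start: str, end: str, path: Tuple[str, ...] = ()) -> List[List[str]]:
    """Return every path from `start` to `end`, following parent -> child edges."""
    path = path + (start,)
    if start == end:
        return [list(path)]
    found = []
    for child in graph.get(start, []):
        if child not in path:  # guard against revisiting a node on this path
            found.extend(find_all_paths(graph, child, end, path))
    return found


graph = {"Stand": ["Polygon", "Species"], "Polygon": ["Area"], "Species": ["Area"]}

# The result is itself a graph: key is the (start, end) pair, value the paths.
paths = {("Stand", "Area"): find_all_paths(graph, "Stand", "Area")}
```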

This is a minimal system to generate paths and "join" two graphs. Note -- there are various patterns possible between "parent" and child graphs.

  1. Graphs have the same key names, but their lists represent a hierarchical relation (the above example).
  2. Every node in the "parent" graph is actually the name of another graph among the children (i.e. for every key in the parent, there is a graph).
  3. Various other patterns, which represent all possible combinations of a 2D matrix of the following triple representing our graph interpretation of a Python dictionary: GraphName, KeyName, ValueList.
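
The index-based "join" described in step 5 of the minimal use case can be sketched as below, in present-day Python. The function name select_column, the table name, the column names and the data values are all made up for illustration; only the graph names follow the XAYA prefix convention.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]  # key -> value list


def select_column(measures: Graph, data: Graph, table: str, column: str) -> Dict[str, str]:
    """Look up the column's position in the "parent" (measures) graph, then
    read the value at that position for every key of the "child" (data) graph."""
    position = measures[table].index(column)
    return {instance_id: values[position] for instance_id, values in data.items()}


# "Parent" graph: keys are data tables, values are their columns.
measTblPolygons = {"TblPolygons": ["PolygonID", "Area", "Species"]}

# "Child" graph: keys are instance IDs, values are tuples of data.
dataTblPolygons = {
    "1": ["P001", "10.5", "Fir"],
    "2": ["P002", "3.2", "Pine"],
}

areas = select_column(measTblPolygons, dataTblPolygons, "TblPolygons", "Area")
```

This behaves like a single-column SELECT: the measures graph plays the part of the table definition and the data graph the part of the rows.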

XAYASubjectDataGraphs?.

What Does the Graph Mean? 8 Interpretations of DictionaryAsGraphs? to define a Knowledge Management System.

| *GraphTypeName* | *Keys Represent* | *Value List Represents* | *Comments* |
| Data | InstanceID | Tuple of Values | Nature of list depends on whether the instance is a table or XML |
| Measures | Data Tables | Columns/Variables | List of attributes of a table or XML schema |
| Schema | Tables or Elements | Child Tables or Elements | Similar to E-R representation |
| Objects | Object Names | Corresponding Tables or Elements | Data equivalent of an abstract class |
| Domains | Columns/Variables | Max/Min for Numeric, Otherwise Codes | Range of legal values |
| Definitions | Any other GraphTypeName or Graph Name | A String or Type Definition | Special case is defining data types |
| Models | Model Name | All Graphs In Model | A collection of related graphs |
| Graphs | Graph Name | All Keys in Graph | Master list of all graphs in memory or on disk |

The model here is for a system that at its lowest level is data tables (whether in a db or as text files), and at its highest level is a domain model.

A node is a "Thing"

A directed edge is a non-symmetric relation between "Two Things", represented as pairs of nodes.

A directed graph is a collection of "Similar Things" -- a collection of edges identified by the key of the Parent Nodes

A model is a collection of "Related Things", directed graphs -- with specified (in the model) relations between the GraphName, KeyName, ValueList triple in one graph and another. Essentially, it is a collection of binary relations among graphs (call them edges gone meta .....).

An Object is the idea of a Thing, without implementation details (higher level than table, or XML schema structures). It has semantic definitions. And it has relations to other objects (again above the level of table/schema implementation details). The same object could be represented in multiple schema.

XAYA convention will be to name data using the first 4-5 letters of its interpretation as a prefix. For example: dataTblPolygons, objeStand, measTblPolygons, modelInventory

XayaSubjectDataGraphsExample?

To make this a little less abstract, a small Forest Inventory based example has been created. The example graphs are stored in the attached file "devcore_20040826.zip" which you can download here:

Below follows an explanation of each graph, and the logical process by which we are working down from a high level Ontology to low level data models (or working up in reverse direction for that matter .... or working from the middle up and down, to be most accurate as to how it actually works in practice ... ). For those familiar with the history of data modelling that led to LifeLine ... the modelling process below can be called "Return of the Long Model" (prequel to The Database strikes back).

XAYAFormat

Data is in text files in a format that can easily be converted to Python Dictionary structures where each line of a file has the following form:

Key : Value1, Value2, Value3 .... ValueN

This format is similar to that used by YAML, a data serialization format geared to be human readable and machine parsable, but focussed more towards dynamic languages and programming constructs than typical markup languages.

The common interpretation across all these data structures is:

  • Keys represent parent nodes
  • Values represent child nodes (often terminal leaf nodes).
  • Thus the key-value combos define all edges in the graph.
  • It's graphs all the way down, and all the way up. (The earth is held up on the backs of turtles, my son. Who holds up the turtles, dad? The turtles are on the backs of other turtles. You mean it's turtles all the way down? 'Fraid so ..... )
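The "Key : Value1, Value2 ..." interpretation above can be sketched as a small parser. This is only a sketch: the names `parse_line` and `parse_graph` are hypothetical, and it assumes that keys never contain a colon and values never contain a comma.

```python
def parse_line(line):
    """Split 'Key : V1, V2, ...' into (key, [values])."""
    key, _, rest = line.partition(":")          # split on the first colon only
    values = [v.strip() for v in rest.split(",") if v.strip()]
    return key.strip(), values


def parse_graph(text):
    """Turn a block of XAYA-format lines into a dict of key -> value list."""
    graph = {}
    for line in text.splitlines():
        if line.strip():                        # skip blank lines
            key, values = parse_line(line)
            graph[key] = values
    return graph


sample = """
Polygons : Plots
Plots : Trees, Protocols
Trees : Measurements
"""
graph = parse_graph(sample)
```

A key with nothing after the colon comes back with an empty list, which matches the idea of a terminal leaf node.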

There will have to be fn's to inter-convert between the text format, DB tables, XML docs, and Python Dictionaries. Python Dictionaries form the universal translator, the intermediate form for going between all other forms (ala Safe Software's approach to Semantic Translation). The whole thing is made language independent via XML-RPC (which can handle dictionaries as structs).
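As a rough check on the XML-RPC claim, the standard library's `xmlrpc.client` can marshal a dictionary of lists into an XML-RPC struct and back without any server running. The graph contents below are illustrative only.

```python
import xmlrpc.client

# A graph in dictionary form: string keys, list values.
graph = {"Plots": ["Trees", "Protocols"], "Trees": []}

# Marshal the graph as the single value of an XML-RPC response.
payload = xmlrpc.client.dumps((graph,), methodresponse=True)

# Unmarshal it again; loads() returns (params_tuple, method_name).
(decoded,), method = xmlrpc.client.loads(payload)
```

One caveat worth keeping in mind: XML-RPC struct keys must be strings, so a graph keyed by integers would need its keys converted first.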

Project Level

Contained in the following files:

DataFiles_X001.txt

  • This file lists all files in a project, including itself.

Graphs_X001.txt

  • This will be an autogenerated file which uses DataFiles_X001.txt to create a graph where the keys are file names, and the values are all keys in a particular file (i.e. the node list).
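A possible shape for the autogeneration step is sketched below. The function name `build_graphs_index`, the `read_text` callback and the in-memory "files" are assumptions made so the example runs without touching disk; the real layout of DataFiles_X001.txt may differ.

```python
def build_graphs_index(file_names, read_text):
    """Map each file name to the list of keys found in that file."""
    index = {}
    for name in file_names:
        keys = []
        for line in read_text(name).splitlines():
            if ":" in line:                      # only 'Key : values' lines
                keys.append(line.split(":", 1)[0].strip())
        index[name] = keys
    return index


# Stand-in for files on disk (contents are hypothetical).
fake_files = {
    "DataFiles_X001.txt": "Project : DataFiles_X001.txt, Objects_X001.txt",
    "Objects_X001.txt": "Stand : LeadingSpecies, Age\nPlot : PlotType",
}
index = build_graphs_index(fake_files, fake_files.__getitem__)
```

Passing the reader in as a function means the same code could later be pointed at real files, a database, or an XML-RPC call.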

Semantics/Ontology Level

Contained in the following files:

Objects_X001.txt

  • Keys are objects, values are attributes of the object

ObjectOntology_X001.txt

  • Keys are objects and represent the "subject"
  • Values are in a list of tuples (rather than simply a list)
  • Each tuple represents a "predicate" (verb), "object" pair, as in: This Subject is/contains/has/pick-yer-verb Object.
  • Format here is more complex: Key : (verbA, Object1), (verbB, Object2) ....
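The tuple format for the ontology file could be read with a small regular expression. This is a sketch under the assumption that verbs and object names contain no commas or parentheses; `parse_ontology_line` and the sample line are hypothetical.

```python
import re

# Matches "(verb, Object)" and captures the two parts without padding.
PAIR = re.compile(r"\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*\)")


def parse_ontology_line(line):
    """Split 'Subject : (verb, Object), ...' into (subject, [(verb, object)])."""
    subject, _, rest = line.partition(":")
    return subject.strip(), PAIR.findall(rest)


subject, pairs = parse_ontology_line(
    "Stand : (contains, Plot), (has, LeadingSpecies)"
)
```

Each result reads as a subject-predicate-object statement, e.g. Stand contains Plot.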

Structural level

This is the part that corresponds to traditional data modelling in databases. These files are fairly close approximations of the LifeLine long model. There are three parallel files, which have the same key-set, but whose values are different forms of meta-data. The structural level could be said to deal with "collections" and is above the level of individual instances of data. It corresponds to Relational Modelling, as it is usually understood.

Measures_X001.txt

  • Keys correspond to a text file that is a "data table".
  • Values are the list of columns/attributes for such a table.
  • This structure would have to be modified somewhat to support XML schemas.

MeasureTypes_X001.txt

  • Keys correspond to a text file that is a "data table".
  • Values are the list of column datatypes -- in terms of Python's built-in types.

MeasureUnits_X001.txt

  • Keys correspond to a text file that is a "data table".
  • Values are the units of measurement for each attribute -- m, cm, years, etc.
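Because the three files share a key-set and their value lists line up position by position, they can be zipped into one description per column. The table contents and the function name `describe_table` below are made up for illustration.

```python
# Three parallel graphs with the same key-set (contents are hypothetical).
measures = {"datTrees_X001.txt": ["Species", "Height", "Age"]}
measure_types = {"datTrees_X001.txt": ["str", "float", "int"]}
measure_units = {"datTrees_X001.txt": ["", "m", "years"]}


def describe_table(table):
    """Return one (name, type, unit) triple per column of the table."""
    return list(zip(measures[table],
                    measure_types[table],
                    measure_units[table]))


columns = describe_table("datTrees_X001.txt")
```

Note that `zip` silently stops at the shortest list, so a real implementation would probably want to check that the three lists have equal length first.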

DataDictionary_X001.txt

  • Keys are Measures (columns/attributes)
  • Values are the range of data allowed. For numeric data, it's a 2-element list of min, max. For coded data, it's the list of allowable codes. For text data it is an empty list (anything goes).
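The three cases in the data dictionary (range, code list, anything goes) can be sketched as one validation function. `is_allowed` is a hypothetical name, and the rule used to tell a range from a code list (exactly two numbers) is my assumption; a coded field with exactly two numeric codes would be misread as a range under it.

```python
def is_allowed(value, allowed):
    """Check a value against a data dictionary entry."""
    if not allowed:                              # empty list: anything goes
        return True
    is_range = (len(allowed) == 2 and
                all(isinstance(x, (int, float)) for x in allowed))
    if is_range:                                 # numeric [min, max]
        low, high = allowed
        return low <= value <= high
    return value in allowed                      # list of allowable codes


# Illustrative entries, keyed by measure name.
data_dictionary = {
    "Height": [0, 80],
    "Species": ["Fd", "Cw", "Hw"],
    "Comments": [],
}
```

For example, `is_allowed(12.5, data_dictionary["Height"])` passes while a height of 95 does not.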

Schema_X001.txt

  • Corresponds to an E-R ModellingDiagram. Can be used to generate "JTree Tables" ala LifeLine.
  • Keys are Parent Entities
  • Values are their Child Entities
  • I should have used the full file names -- but I used truncated versions (will have to correct).
  • The model I've used is a simplification of a Forest Inventory Warehouse.

Models_X001.txt

  • Key is the model name
  • Values are files in the model.
  • In this case, I included mainly the "structural" and data files in the model. I could have added (and probably should have) all the semantic files.

Mapping Semantics To Structures.

The object level can be considered akin to abstract classes -- devoid of implementation. The structural level deals with the implementation as tables or table-like structures (or alternatively, in the XML world, XML-like or Schema-conformant structures).

We have to map Semantics to Structures, because any Object may have one or more structural equivalents (particularly if we are integrating data from multiple databases). So for example an object Stand may correspond to different tables with different names and different variable labels in 2 databases, both of which describe the same "Object" (a stand) and have some common variables (Leading Species), but may otherwise differ. These mappings allow us to integrate ACROSS data sets where there is some common info.

Map_ObjectsToData_X001.txt

  • Keys are the Object Names
  • Values are the tables that they correspond to, or overlap with.

Map_ObjectAttributesToMeasures_X001.txt

  • Keys are Object Attribute Names
  • Values are the corresponding Measure names (which may or may not be the same).
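Here is a rough sketch of how the two mapping graphs might be used together to locate an object attribute across tables. The column names (LeadSpp, Species1) and the function `locate` are invented for the example; only the file names come from this page.

```python
# Object -> tables it corresponds to or overlaps with.
map_objects_to_data = {
    "Stand": ["datPolygons_X001.txt", "datStandStructues_X001.txt"],
}
# Object attribute -> measure names that carry it (hypothetical names).
map_attributes_to_measures = {"LeadingSpecies": ["LeadSpp", "Species1"]}
# Table -> ordered column list, as in Measures_X001.txt (hypothetical columns).
measures = {
    "datPolygons_X001.txt": ["LeadSpp", "Area"],
    "datStandStructues_X001.txt": ["Layers", "Species1"],
}


def locate(obj, attribute):
    """Return (table, column position) pairs where the attribute is stored."""
    wanted = set(map_attributes_to_measures.get(attribute, []))
    hits = []
    for table in map_objects_to_data.get(obj, []):
        for position, measure in enumerate(measures.get(table, [])):
            if measure in wanted:
                hits.append((table, position))
    return hits


where = locate("Stand", "LeadingSpecies")
```

The result says which position in each table's value list holds the shared attribute, which is what is needed to integrate across the two data sets.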

Data Instance Level

The 11 graphs above are constant in form and interpretation -- what changes is their contents -- which depend on the actual data (instances) being modelled at a low level. One could say instance information flows through the Structural and Semantic levels, which do not change in form.

In this case, our low level data is a model of the forest inventory, corresponding to the schema in Schema_X001.txt. There can be any number of tables in the low level model. Every table has the same interpretation:

Key : Value1, Value2, .... ValueN

  • Where the Key represents a primary key for the data
  • Where the Values correspond to a tuple of data values (observations) associated with the key.

In our model we have the following tables:

  • datStandStructues_X001.txt -- List of Stand Structures from a classification of forest stand types
  • datPolygons_X001.txt -- Data about Polygons (roughly correspond to "Stands")
  • datPlots_X001.txt -- Data about Plots
  • datProtocols_X001.txt -- Data about the protocols used to lay out plots.
  • datPlotStructures_X001 -- List of Plot structures from a plot-based classification of forest stand types.
  • datTrees_X001.txt -- List of Trees that have been measured in Plots.
  • datMeasurements_X001.txt -- Repeated Measurements on the Trees over Time.

And there you have it, a single data graph data structure taking us all the way from high-level ontologies to low level data; and which is sufficiently flexible to map to ascii data files, XML, Relational Database Tables, and which can be passed across programming languages via the XML-RPC protocol.

One Graph to Rule them All.

XAYAPredicateFunctions

Functions that tell you something about a graph, a node, an edge. Currently there are twelve functions. If some of the functions can be defined in terms of others, we can reduce this list to a set of axiomatic functions.

  • trans (transforms one data structure to another, e.g. an ascii file to a Python dictionary)

    • read (e.g. into a dictionary)

    • write (e.g. into a file or into a db)

      • print (write to terminal)

  • is (a root node, a leaf node, a parent node, a child node, etc)

  • get (a root node, a leaf node, a parent node, a child node, etc)

  • bind (graph one to graph two based on some relation. Note -- wanted to avoid the word join -- since bind may not correspond to relational join, depending on how it's working)
    • What relations are possible between two graphs? Each graph can be represented by 3 sets of information. (1) Its name (scalar). (2) Its keys (collection bound to the name). (3) Its values (collections, each collection bound to a key). So between any two graphs, there is a 3 * 3 matrix of possible bindings (some of which are more common patterns ... others of which are less common).
      • Pattern 1: BindByValues. Values of Graph1 (say Measures) bind to specific values of Graph2 (say the data). So, measure3 is used to pick up all the data at position 3. This is basically similar to how columns work with data tuples in an RDB.
      • Pattern 2: BindKeyToNames. The keys A to Z in Graph1 are each bound to the names of a collection of graphs GraphA ..... GraphZ.
      • Pattern 3: BindValuesToKeys. The values for a key in Graph 1 bind to keys in Graph 2. For example, say Graph 1's keys correspond to Relational Data models, with its values being all the tables in a model. Graph 2 could then be the schema of tables in a given model, with its keys corresponding to individual tables. Such a pattern could be used to allow linking data across multiple models, say via a higher order Business Object model.
      • Pattern 4: BindKeysToValues. Say Graph 1 and Graph 2 are data tables with a parent-child relationship. The primary key of Graph 1 is its key. The values of this key are also represented in Graph 2's values, at a particular position in the value list. BindKeysToValues is like a Parent-To-Child join between the graph representations of these two tables.
      • Other Patterns. These follow from the matrix Graph1(Name, Key, Value) x Graph2(Name, Key, Value), i.e. 9 possible bindings. For the most part, we are interested in hierarchical relationships, where in some sense one graph is a parent of another. That is, there is a 1-to-1 or 1-to-many reln amongst the graphs.

  • split (a path, graph, model into full subsets based on conditions, e.g. split a graph into subsets using given criteria to define which subset each node falls into)

  • filter (a path, graph, model into partial subsets based on conditions -- e.g. filter a graph so only the selected nodes remain; filter may simply be a special case of split, followed by kill)

  • make (a node, an edge, a graph ....)

  • kill (a node, an edge, a graph ....)

  • gen (a function mapping inputs to outputs, e.g. genNodeIndex takes a graph of node names (keys) and node lists (values) and maps it to a set of character separated integer strings (for both keys and values))

  • sum (takes a sequence of values and returns a scalar -- e.g. sum([pathlist]))
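Two of the bind patterns from the list above (BindByValues and BindKeysToValues), together with split and filter, can be sketched on plain dictionaries. All function names and sample data here are hypothetical, and this is only one of several ways the bindings could work.

```python
def bind_by_values(measures, data, table):
    """Pattern 1: pair each data value with the measure at the same position."""
    names = measures[table]
    return {key: dict(zip(names, values)) for key, values in data.items()}


def bind_keys_to_values(parent, child, position):
    """Pattern 4: group child keys under the parent key found at `position`."""
    bound = {key: [] for key in parent}
    for child_key, values in child.items():
        parent_key = values[position]
        if parent_key in bound:
            bound[parent_key].append(child_key)
    return bound


def split(graph, condition):
    """Split a graph into (matching, non-matching) subsets."""
    kept, rest = {}, {}
    for key, values in graph.items():
        (kept if condition(key, values) else rest)[key] = values
    return kept, rest


def filter_by(graph, condition):
    """Filter as a special case of split: keep one subset, drop the other."""
    kept, _dropped = split(graph, condition)
    return kept


measures = {"datPlots": ["Polygon", "PlotType"]}
polygons = {"P1": ["Fd"], "P2": ["Cw"]}
plots = {"PlotA": ["P1", "fixed"],
         "PlotB": ["P1", "variable"],
         "PlotC": ["P2", "fixed"]}

labelled = bind_by_values(measures, plots, "datPlots")
children = bind_keys_to_values(polygons, plots, 0)
fixed_only = filter_by(plots, lambda key, values: values[1] == "fixed")
```

Here `children` plays the role of a parent-to-child join: each polygon key ends up with the list of plot keys that point back to it.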

The functions, once built, can also be made available in a non-language-specific way via XML-RPC. In this context there would be an XML-RPC Xaya Client focussed on read/write functions of local info, and an XML-RPC Xaya Server which handles most other functions, including any read/write on the server. If you fully enable both client and server, you have the beginnings of a peer-to-peer framework.

Subject Data Graphs + Predicate Functions = A Little Domain Specific Language Implemented as Graphs

Using the 8-fold model above, that Domain Specific Language represents a knowledge base (which in turn is defined by the actual data in each of the graphs), spanning low level data, to high level semantics. However, other sets of graphs and operations could be defined. For example Graph structures are used to parse simple languages. See Compiling Little Languages in Python by John Aycock for an example.

So, essentially, any given model of graphs represents a general problem domain (the 8-fold GraphInterpretationPattern being the path of Knowledge Representation .... if not actually wisdom).

In addition to the general functions defined above, additional domain specific functions may be defined, using the general functions as their components. So, we are building "from the bottom up" several layers of language. For more on this, see the general article by Paul Graham on Programming Bottom Up

XayaCoreSpecification

Finally ..... The XAYA001 specification

1. readFromFile(FilePathOrName) -- reads from ascii format either a graph or a path (a path is simply a 1 row dictionary with a key and a value list)

2. genNodeIndex(GraphName)

3. findAllPaths(GraphName, StartNode, EndNode)

4. filterShortestPaths([PathsList]) -- a list of lists is the PathsList

5. filterLongestPaths([PathsList])

6. sumPathLength([Path])

7. findRoots(GraphName, [NodeList])

8. isRoot(GraphName, NodeKey)

9. writeToFile(GraphOrPathName, FilePathOrName)

10. filterGraph(GraphName, [NodeList])

11. bindGraphs(ParentGraph, ChildGraph) -- may require either several fns, or a single fn with a longer argument list, to define different ways of binding two graphs. I always imagine this as the way proteins bind.

12. makeNode(GraphName, NodeName)

13. makeEdge(GraphName, ParentNode, ChildNode)

14. killNode(GraphName, NodeName)

15. killEdge(GraphName, NodeName)
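A few of the simpler functions in the list above can be sketched directly on dictionaries. The snake_case names and signatures are my own guesses at one possible implementation, not the specification itself; in particular, `kill_edge` here takes both a parent and a child, which differs from the two-argument form listed above.

```python
def is_root(graph, node):
    """A root is a key that never appears as anyone's child."""
    return node in graph and all(node not in kids for kids in graph.values())


def find_roots(graph, nodes=None):
    """Return the roots among `nodes` (or among all keys if none given)."""
    candidates = graph if nodes is None else nodes
    return [n for n in candidates if is_root(graph, n)]


def filter_graph(graph, nodes):
    """Keep only the listed nodes, both as keys and as children."""
    keep = set(nodes)
    return {k: [c for c in v if c in keep]
            for k, v in graph.items() if k in keep}


def make_node(graph, node):
    graph.setdefault(node, [])


def make_edge(graph, parent, child):
    make_node(graph, parent)
    make_node(graph, child)
    if child not in graph[parent]:
        graph[parent].append(child)


def kill_edge(graph, parent, child):
    if child in graph.get(parent, []):
        graph[parent].remove(child)


def kill_node(graph, node):
    """Remove the node and every edge that points to it."""
    graph.pop(node, None)
    for kids in graph.values():
        while node in kids:
            kids.remove(node)


g = {}
make_edge(g, "Polygons", "Plots")
make_edge(g, "Plots", "Trees")
```

After the two `make_edge` calls, `g` holds three nodes and "Polygons" is the only root.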

Some More Rough Algorithms

.... To be continued, as the implementation is worked out.

General Tree Walking Algorithm

For each NodeKey, walk through its edges. Build a list from nodes to edges. Each edge then recursively becomes a node key. Continue until you have hit a leaf node, or encounter a node already in the list (i.e. a cycle). Make sure you follow alternate branches, and don't just end up running down one branch.
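The walk described above is essentially a recursive depth-first search that refuses to revisit a node already on the current path. A minimal sketch follows; the function names mirror the spec loosely but are assumptions, and the recursion will hit Python's recursion limit on very deep graphs.

```python
def find_all_paths(graph, start, end, path=None):
    """Return every cycle-free path from start to end as a list of nodes."""
    path = (path or []) + [start]
    if start == end:
        return [path]
    paths = []
    for child in graph.get(start, []):
        if child not in path:                    # skip cycles
            paths.extend(find_all_paths(graph, child, end, path))
    return paths


def filter_shortest_paths(paths):
    """Keep only the paths of minimum length."""
    if not paths:
        return []
    shortest = min(len(p) for p in paths)
    return [p for p in paths if len(p) == shortest]


# Small graph with a cycle between C and D.
graph = {"A": ["B", "C"], "B": ["C", "D"], "C": ["D"], "D": ["C"]}
all_paths = find_all_paths(graph, "A", "D")
shortest = filter_shortest_paths(all_paths)
```

Because each recursive call iterates over every child, alternate branches are followed automatically rather than running down just one.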

Node Index Generation

The goal is to generate an index such as 1.1, 1.2.1, etc., defining node locations in the graph. Once such an index is generated, operations on it can be easier than full graph-walking algorithms (which don't make any assumptions about the graph's structure). In particular, regular expressions on the lists of node indexes can be used to find the paths.
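One way the index might be built is sketched below: each node gets a dotted label recording its position under a root, and a regular expression over the labels then finds everything below a given node. The function name and the label-to-node layout are assumptions; the real genNodeIndex may index keys and values differently.

```python
import re


def gen_node_index(graph, roots):
    """Map dotted position labels ('1', '1.1', '1.2.1', ...) to node names."""
    index = {}

    def walk(node, label, seen):
        index[label] = node
        for i, child in enumerate(graph.get(node, []), start=1):
            if child not in seen:                # do not loop around cycles
                walk(child, f"{label}.{i}", seen | {child})

    for i, root in enumerate(roots, start=1):
        walk(root, str(i), {root})
    return index


graph = {
    "Polygons": ["Plots", "StandStructures"],
    "Plots": ["Trees"],
    "Trees": ["Measurements"],
}
index = gen_node_index(graph, ["Polygons"])

# Everything strictly below node 1.1 (Plots), found by pattern alone.
under_plots = sorted(label for label in index
                     if re.fullmatch(r"1\.1(\.\d+)+", label))
```

The regular expression never touches the graph itself, which is the attraction of the index: descendants of a node are just the labels that extend its label.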

XAYAObjects

Currently in the first iteration there are no "official" objects, though obviously we are using some object oriented/relational hybrid thinking in our modelling approach. Ultimately, we may create an object such as DiGraph(dict), inheriting from the built-in dictionary type. This is a bit different from some other approaches, which usually proceed by defining node and edge classes, and several types of graphs: graph, digraph, tree. This leads to a menagerie of Graph related objects.

I'm thinking of something much simpler: a single base class (DiGraph) that essentially extends the basic capabilities of a dictionary. There may also need to be a container class, for operations "across graphs" rather than operations within a graph.

Focussing on DiGraph, we can further sub-class, and actually represent nodes and edges as themselves inheriting from the DiGraph class:

DiNode(DiGraph) has a name, which is the node's name. Its keys represent data associated with the node ("cargo") and the values are the values of the data.

DiEdge(DiGraph) has a name, which is the string representation of two nodes (e.g. "A-->B"), and whose values are the tuple (Parent, Child).

With just these constructs one should be able to build fairly complex models, including ones where a single node in one model, may itself contain a whole other model. Consider this for cases where different models exist at different scales, or where one has nested models.

The current functions then become methods within this class. And subclassing can over-ride what they do, and add further methods.
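A very rough sketch of what the class-based version might look like is given below. The constructor signature (name first, then the usual dict arguments) and the two methods shown are assumptions; nothing here is settled design.

```python
class DiGraph(dict):
    """A named dictionary: keys are parent nodes, values are child lists."""

    def __init__(self, name, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.name = name

    def edges(self):
        """Return every (parent, child) pair in the graph."""
        return [(p, c) for p, kids in self.items() for c in kids]

    def is_root(self, node):
        """A root is a key that never appears as a child."""
        return node in self and all(node not in kids
                                    for kids in self.values())


class DiNode(DiGraph):
    """A node: keys are cargo field names, values are the cargo itself."""


schema = DiGraph("Schema", {"Polygons": ["Plots"], "Plots": []})
stand = DiNode("Stand42", {"LeadingSpecies": "Fd", "Age": 80})
```

Since both classes are still dictionaries, everything that already works on plain dicts (iteration, `len`, XML-RPC marshalling of the contents) keeps working.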

XAYALanguage

So far, we've defined the mini-language as a library in Python (i.e. we're using Python itself as a little language, in the manner illustrated in the Python Cookbook, pg 449, which also contains yet another graph implementation example). The next step, is to go further and actually extend the Python language. The notes here are some very preliminary ideas as to what such an extended language might look like.

Can we, based on the constructs we've created, define a simple extension to the Python core language to represent Graph operations -- a type of relationally modelled path language (call it XayaPath)? If we could, it would allow Python to add to its current functional and OO styles of programming the declarative style of a logic/relational language.

I've got some rough notes in my Xaya journal right now, that I have to try out.

Without referring to the notes (scattered over several pages of journals, and in the backs of several other pieces of paper) here's the general sketch of the idea.

Define a declarative style language that defines a path , using keywords, operators, tokens not currently used by Python (i.e. this language + Python is a superset of Python). As in any declarative language, it states "what is wanted", not "how it is calculated" (done by the underlying system, in this case the query engine functionality)

Say there are 2 graphs, G1 and G2 (represented by perhaps the DiGraph class).

Notation Thoughts (very rough)

  • | | Indicates beginning and end of a path statement within a graph. (Intra-Path)

  • || || Indicates beginning and end of a path statement across graphs. (Inter-Path)

  • |{ Names }|

  • |( keys )|

  • |[ Values ]|

  • |{G1} (Key1 : KeyN)[Var5,Var6,Var7] | Filters Graph 1 for Keys 1 to N and projects Values 5,6,7 only. In SQL terms, I just said something like "SELECT Var5, Var6, Var7 FROM G1 WHERE (Keys < (N + 1))". The "path statement" used approximately 27 characters, where the SQL form used approximately 40. One could see how, if the source data was stored in an RDB, the path statement could directly translate to an SQL query. Something similar could likely be developed if the data storage source was an XML document.

  • --> Bind Operator

  • || "Some Set of InterGraph Operations Placed Here" || Specifies chained graph operations

  • || {G1} --> {G2}, ByValues || -- Bind Graph G1 to Graph G2 by Values.

  • || {G1} (Key1 : KeyN)[Var1 : Var4] --> {G2}( : )[Var5, Var6], ByValues || Which says something like "SELECT G1.Var1, G1.Var2, G1.Var3, G1.Var4, G2.Var5, G2.Var6 FROM G1, G2 WHERE (G1.Keys = G2.Keys) AND (G2.Keys < (N+1))". Note in this case, the number of tokens from the path statement vs the SQL-like equivalent are roughly: 58 vs 100.

  • >>> If len(|| {G1} (Key1 : KeyN)[Var1 : Var4] --> {G2}( : )[Var5, Var6], ByValues ||): "You Have Data"
    • Finally, you should be able to mix "path statements" with regular Python. (More "readable" if the path statement was first bound to a variable name such as >>> myPath = || {G1} (Key1 : KeyN)[Var1 : Var4] --> {G2}( : )[Var5, Var6], ByValues || )
      • which leads to >>> If len(myPath) : "You have data"

Admittedly the paths are a bit cryptic -- but this demonstrates they can be "expanded out" into SQL-like, XPath-like, XQuery-like statements by path-translator functions. This allows a common internal representation of paths, which can then get executed amongst heterogeneous data sources.

Someone with a better eye for readability should see if there are alternative symbol sets that are both concise and easy to read. Also, it's important that there be no ambiguity in the interpretation (which currently there might be).
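To give a feel for what a path-translator function might do, here is a sketch that expands the pieces of a single-graph path statement (graph name, key range, projected variables) into an SQL-like string. It takes the pieces as ordinary arguments rather than parsing the bar notation, and the name `path_to_sql` and the `Keys` column name are assumptions.

```python
def path_to_sql(graph, variables, first_key=None, last_key=None,
                key_column="Keys"):
    """Expand {graph}(first_key : last_key)[variables] into an SQL string."""
    sql = f"SELECT {', '.join(variables)} FROM {graph}"
    conditions = []
    if first_key is not None:
        conditions.append(f"{key_column} >= {first_key}")
    if last_key is not None:
        conditions.append(f"{key_column} <= {last_key}")
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql


# Roughly: |{G1} (1 : 10)[Var5, Var6, Var7]|
query = path_to_sql("G1", ["Var5", "Var6", "Var7"], first_key=1, last_key=10)
```

The string is built by plain interpolation, so this is fine for exploring the idea but should not be fed untrusted input; a real translator would use parameterised queries.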

Our thoughts here are really expressing "paths" in terms of 3 operations: intersections amongst sets represented by graphs via "-->"; constraints via "(keys list)"; and projection via "[variables list]". So, every path is some combination of intersection of sets (including itself), constraints on a set, and projection of set attributes. In short, if you can represent collections of data as sets, this language should be able to operate on those sets in a consistent manner. In terms of the transition from "low level data" to "high level semantics/ontologies", what one is doing is really building a hierarchical model that, across sets, provides the connections between the high level (say the ontology) and the low level (the actual data, in its native storage format).

Somewhere behind all this is the idea that Lisp notation reflects the actual parse trees of the language. And I'm imagining something similar in Python for these path statements. Of course, I'm really not sure how Lisp parse trees work, so the analogy might be spurious :-{
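The three operations can be tried out today as ordinary functions on dictionaries, long before any new syntax exists. The names `constrain`, `project` and `intersect` and the sample graphs are hypothetical; projection here is by position in the value list.

```python
def constrain(graph, keys):
    """(keys list): keep only the listed keys."""
    return {k: graph[k] for k in keys if k in graph}


def project(graph, positions):
    """[variables list]: keep only the values at the given positions."""
    return {k: [values[i] for i in positions] for k, values in graph.items()}


def intersect(g1, g2):
    """-->: keep keys present in both graphs and concatenate their values."""
    return {k: g1[k] + g2[k] for k in g1 if k in g2}


g1 = {1: ["a1", "b1", "c1"], 2: ["a2", "b2", "c2"], 3: ["a3", "b3", "c3"]}
g2 = {1: ["x1", "y1"], 2: ["x2", "y2"], 4: ["x4", "y4"]}

# Roughly: || {G1}(1 : 2)[0, 1] --> {G2}( : )[1] ||
result = intersect(project(constrain(g1, [1, 2]), [0, 1]),
                   project(g2, [1]))
```

Every path statement then becomes a nesting of these three calls, which is one candidate for the "common internal representation" that translators to SQL or XPath could work from.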

XAYAcode

This is some initial code, developed to illustrate ideas for XAYA in its initial iteration. It should not be taken as the final code for the first iteration, but really as a working out of code conventions and an initial command line interface via prototyping. Expect the code to be updated every few days. The top link is the most recent version of the code (so one gets a running view of the developing style). Current conventions are to use a test-first development style (define unit tests, write code) via the unittest module, a pseudo-literate coding style that provides documentation of any algorithms used and references to code sources, and finally incorporation of usage examples via the doctest module. Code is currently more verbose than it needs to be, and will get stripped down. Try to balance conciseness and clarity (emphasis on "try").

Original -- MishtuBanerjee - 16 Aug 2004

Revised -- MishtuBanerjee - 30 Sep 2004

META FILEATTACHMENT devcore_20040826.zip attr="" comment="Prototypes as of August 26, 2004" date="1096586045" path="C:\XAYA\devcore_20040826.zip" size="8585" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040914.zip attr="" comment="Prototypes as of September 14, 2004" date="1096586069" path="C:\XAYA\devcore_20040914.zip" size="21282" user="utsim" version="1.1"
META FILEATTACHMENT devcore_20040921.zip attr="" comment="Prototypes as of September 21, 2004" date="1096586105" path="C:\XAYA\devcore_20040921.zip" size="146169" user="utsim" version="1.1"
Revision r1.1 - 30 Sep 2004 - 23:15 - Main.utsim
Revision r1.21 - 21 Jan 2005 - 06:09 - Main.utsim