Research on adaptable backend databases

Pat_L · November 4, 2011, 1:10pm

Does anyone know of research on designing databases to be more adaptable? I am not a computer scientist, but it seems to me that for big databases, changing the database "on the fly" is always difficult. It seems this is a serious impediment to making data gathering projects adaptable. ODK Collect seems to provide great potential to push new data fields to data gatherers in the field. How are such mid-stream changes dealt with on the database side?

Please forgive and respond directly to me if this is not considered list relevant.
-Pat

Patrick Lorch
Biological Sciences Dept.
Kent State University
256 Cunningham Hall
Kent, OH 44242-0001 USA
O: 330-672-7888
http://drosophila.biology.kent.edu/users/plorch/lorch.html

W_Brunette · November 4, 2011, 4:03pm

This is a problem we plan to address in the future using versioning,
but the DB will not change on the fly: there will be an old
representation and a new representation. It should still make things
easier on the user (let me reiterate, this is in the future). What you
describe is a hard problem. If you keep everything in one set of
tables, how do you deal with the missing data fields, given that users
may want to run queries on pieces of data that were not available in
one set or another? Determining the best behavior in this case is
pretty application specific, which is why databases haven't made this
general and easy: there is no clear "right" answer on what should be done.
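
To illustrate the missing-field problem, here is a minimal sketch using
Python's built-in sqlite3 module. It is not ODK code: the table names
(sightings_v1, sightings_v2), the view, and the sample fields are all
made up for illustration, and a single combined view is only one of
several ways the two representations could be exposed.

```python
import sqlite3


def build_versioned_db():
    """Create an in-memory database holding two versions of one form.

    All table and column names are hypothetical examples.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Old representation: the form before 'habitat' was added.
        CREATE TABLE sightings_v1 (
            id INTEGER PRIMARY KEY, species TEXT, n INTEGER);
        -- New representation: same form with one extra field.
        CREATE TABLE sightings_v2 (
            id INTEGER PRIMARY KEY, species TEXT, n INTEGER, habitat TEXT);

        INSERT INTO sightings_v1 (species, n) VALUES ('robin', 3), ('wren', 1);
        INSERT INTO sightings_v2 (species, n, habitat)
            VALUES ('robin', 2, 'forest');

        -- One combined view; old rows get NULL for the field they lack.
        CREATE VIEW sightings_all AS
            SELECT 1 AS form_version, species, n, NULL AS habitat
              FROM sightings_v1
            UNION ALL
            SELECT 2 AS form_version, species, n, habitat
              FROM sightings_v2;
    """)
    return conn


def count(conn, where):
    """Count rows of the combined view matching a fixed WHERE clause."""
    sql = "SELECT COUNT(*) FROM sightings_all WHERE " + where
    return conn.execute(sql).fetchone()[0]


conn = build_versioned_db()
total = count(conn, "1 = 1")
in_forest = count(conn, "habitat = 'forest'")
not_forest = count(conn, "habitat <> 'forest'")
unknown = count(conn, "habitat IS NULL")
print(total, in_forest, not_forest, unknown)
```

The old rows match neither habitat = 'forest' nor habitat <> 'forest',
because comparisons with NULL are never true in SQL. Whether those rows
should be dropped, counted as "unknown", or given a default is exactly
the application-specific decision described above.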

Waylon

Pat_L · November 4, 2011, 4:29pm

I appreciate what you are saying. I am glad to hear that there is
interest in dealing with this in the future. I am working on a grant
where one part would be to try to design some ways of dealing with this
for scientific data gathering efforts. We may be able to contribute to
future efforts to deal with this problem.

yanokwa · November 5, 2011, 2:29am

One approach is to use the entity-attribute-value model
(http://en.wikipedia.org/wiki/Entity-attribute-value_model) and store
just key-value pairs in the backend.

This will likely require a whole new backend, so it's even further
out. You can find an example of this in OpenMRS.
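
As a rough idea of what that looks like, here is a minimal sketch using
Python's built-in sqlite3 module. It is not how ODK or OpenMRS
implement it: the table name (eav), the function names, and the sample
fields are all assumptions made for illustration.

```python
import sqlite3


def open_eav_store():
    """Create an in-memory store with a single generic table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE eav (
            entity_id TEXT NOT NULL,   -- which submission
            attribute TEXT NOT NULL,   -- which form field
            value     TEXT,            -- the answer, stored as text
            PRIMARY KEY (entity_id, attribute)
        )
    """)
    return conn


def save_submission(conn, entity_id, fields):
    """Store one submission as one row per field, in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO eav (entity_id, attribute, value) VALUES (?, ?, ?)",
            [(entity_id, name, str(value)) for name, value in fields.items()],
        )


def load_submission(conn, entity_id):
    """Rebuild a submission as a dict of attribute -> value."""
    cursor = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity_id = ?", (entity_id,)
    )
    return dict(cursor.fetchall())


store = open_eav_store()
save_submission(store, "obs-1", {"species": "robin", "count": 3})
# A field added mid-stream needs no schema change, just another row.
save_submission(store, "obs-2",
                {"species": "wren", "count": 1, "habitat": "forest"})
print(load_submission(store, "obs-1"))
print(load_submission(store, "obs-2"))
```

Note that every value comes back as text in this sketch, so type
information has to be tracked somewhere else, which is part of the cost
of the generic model.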

Pat_L · November 6, 2011, 3:03pm

I can see that survey-based data gathering and health databases are
very different in their likely volatility. I guess if my goal is to
set up a data gathering system that is adaptable, an (at least
partial) EAV approach might make more sense, though I can imagine this
approach being more difficult to design on the backend.

Thanks for your help.
-Pat

Mitch_Sundt · November 7, 2011, 6:13pm

If you go this way, use a SQL database (PostgreSQL, MySQL, Oracle, MS
SQL Server). You will need strong ACID guarantees. Google BigTable or
Amazon SimpleDB (SDB) will not be appropriate.

While such an approach allows considerable extensibility, it is likely to
make data visualization and integration with third-party data visualization
software more difficult because of the highly reduced, generic data model.
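
To show where the extra work comes from: most visualization tools
expect one row per record with one column per field, so generic
(entity, attribute, value) rows have to be pivoted first. Below is a
minimal Python sketch of that step. The function name pivot_eav and the
sample rows are made up for illustration, and a real system might do
this in SQL instead.

```python
def pivot_eav(rows):
    """Turn (entity_id, attribute, value) triples into a wide table.

    Returns a header list and one list per entity. Fields an entity
    never recorded come out as None.
    """
    columns = []      # attribute names, in first-seen order
    by_entity = {}    # entity_id -> {attribute: value}
    for entity_id, attribute, value in rows:
        if attribute not in columns:
            columns.append(attribute)
        by_entity.setdefault(entity_id, {})[attribute] = value

    header = ["entity_id"] + columns
    table = [
        [entity_id] + [attrs.get(name) for name in columns]
        for entity_id, attrs in by_entity.items()
    ]
    return header, table


sample_rows = [
    ("obs-1", "species", "robin"),
    ("obs-1", "count", "3"),
    ("obs-2", "species", "wren"),
    ("obs-2", "count", "1"),
    ("obs-2", "habitat", "forest"),
]
header, table = pivot_eav(sample_rows)
print(header)
for row in table:
    print(row)
```

The set of columns is only known after scanning the data, so any
downstream tool sees a table whose shape can change whenever a new
attribute shows up.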

Mitch

--
Mitch Sundt
Software Engineer
http://www.OpenDataKit.org
University of Washington
mitchellsundt@gmail.com

Patrick_Lorch · November 8, 2011, 2:06am

Thanks for all the information. I had not anticipated the
visualization limitations.
-Pat
