LIS2600 Course Blog: September 2014

Friday, September 26, 2014

Muddiest Point for Week 4

I still do not understand the question I asked in lab this week. When I imported "book.xlsx" from assignment #2, the bookid column's Indexed property showed "Yes (Duplicates OK)", while all the other columns showed "No".

I don't know what this means, or what the difference is between "Yes (Duplicates OK)", "Yes (No Duplicates)", and "No". I just want to figure out whether this setting makes any difference.
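
A minimal sketch of how I understand the three settings, assuming they behave like ordinary SQL indexes: "No" means no index, "Yes (Duplicates OK)" means a plain index, and "Yes (No Duplicates)" means a unique index. It uses Python's built-in sqlite3 instead of Access, and the table and column names (book, bookid, isbn, title) are made up for illustration.

```python
import sqlite3

def build_demo():
    """Create a hypothetical book table with one plain and one unique index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE book (bookid INTEGER, isbn TEXT, title TEXT)")
    # Like "Yes (Duplicates OK)": speeds up lookups, repeated values allowed.
    conn.execute("CREATE INDEX idx_bookid ON book (bookid)")
    # Like "Yes (No Duplicates)": speeds up lookups and rejects repeated values.
    conn.execute("CREATE UNIQUE INDEX idx_isbn ON book (isbn)")
    # title has no index, like "No": lookups on it scan the whole table.
    return conn

def try_insert(conn, bookid, isbn, title):
    """Return True if the row was stored, False if an index rejected it."""
    try:
        conn.execute("INSERT INTO book VALUES (?, ?, ?)", (bookid, isbn, title))
        return True
    except sqlite3.IntegrityError:
        return False

conn = build_demo()
results = [
    try_insert(conn, 1, "A", "First"),
    try_insert(conn, 1, "B", "Same bookid"),  # duplicate bookid is accepted
    try_insert(conn, 2, "A", "Same isbn"),    # duplicate isbn is rejected
]
print(results)  # [True, True, False]
```

So the setting seems to matter in two ways: an index makes searching and sorting on that column faster, and the "No Duplicates" version also stops repeated values from being entered.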

Notes for Week 5 Required Readings

Anne J. Gilliland. Introduction to Metadata: Pathways to Digital Information. 1: Setting the Stage
http://www.getty.edu/research/publicationa/electronic_publications/intrometadata/setting.pdf

This reading gives me a general idea about metadata, what it means, and introduces its history and development. What interests me most in this chapter is how metadata works in information organizations, such as libraries and museums.

Generally speaking, metadata is "data about data": anything that can be used to describe objects (information), including their content, context, and structure. More specifically, for libraries and museums, metadata emphasizes providing and enhancing access to value-added information and collection materials.

There are several important aspects of metadata.

Data Standards - for shared cataloging and exchange of descriptive data.
Structure - with the development of computer-processing capabilities, the role of structure has been growing. Specifically, the more highly structured an information object is, the more that structure can be exploited for searching, manipulation, and interrelating with other information objects.

Also, we should notice the author's argument that metadata can and should be conceptualized more inclusively. To support this view, the author gives several examples of how different kinds of professionals think about metadata. Theory and practice can therefore vary with differing professional and cultural missions.

Moreover, besides explaining several primary functions of metadata, the author also reveals some little-known facts about it, which I found really refreshing. The following two surprised me most.

Metadata does not have to be digital.
Besides describing an object, metadata can also indicate its context, management, processing, preservation, and structure.

Eric J. Miller. An overview of the Dublin Core Data Model.
http://dublincore.org/1999/06/06-overview/

This article introduces a work in progress, the Dublin Core Metadata Initiative (DCMI). It explains DCMI's requirements, its basics, and what has been done to support it. I got the general idea of what DCMI is for. However, I still felt lost in the technical parts of the article.

According to the author, DCMI is designed to foster consensus across disciplines for the discovery-oriented description of diverse resources in an electronic environment.
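
To make "discovery-oriented description" a bit more concrete for myself, here is a minimal sketch of a Dublin Core style record. The fifteen element names are the standard Dublin Core element set; the dictionary layout and the unknown_elements helper are my own assumptions for illustration, not the data model the article defines. The record values come only from the citation above.

```python
# The fifteen element names of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

def unknown_elements(record):
    """Return the keys of a record that are not Dublin Core elements, sorted."""
    return sorted(set(record) - DC_ELEMENTS)

# A record describing the article itself (hypothetical layout).
record = {
    "title": "An overview of the Dublin Core Data Model",
    "creator": "Eric J. Miller",
    "identifier": "http://dublincore.org/1999/06/06-overview/",
}

print(unknown_elements(record))                        # []
print(unknown_elements({**record, "author": "x"}))     # ['author']
```

The point is that people from different disciplines can agree on a small shared vocabulary ("creator" instead of "author", "painter", or "photographer"), which is what makes cross-collection discovery possible.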

Julie Meloni. Using Mendeley for Research Management.

http://chronicle.com/blogs/profhacker/using-mendeley-for-research-management/25627

This article introduces a research management tool, Mendeley.

Key features

Organize - indexes and organizes your own document collections into a bibliography.
Share - also works as a social network, so you can stay up to date with what other people in your field are reading.
Discover - "navigate the web of knowledge", discover the "most reads".

Friday, September 19, 2014

Notes for Week 4 Required Readings

Database
http://en.wikipedia.org/wiki/Database

DBMSs: Data definition; Update; Retrieval; Administration.
Maintaining stored data's integrity & security.

Database & DBMSs: database is not generally portable across different DBMSs;
different DBMSs can interoperate by using standards.

Database design
Produce a conceptual data model (the structure of the information) by developing an entity-relationship model or using the Unified Modeling Language. This needs a good understanding of the application domain.
Translate the model into a schema (logical database design). Normalization helps the database maintain consistency.
Physical database design (data independence; security).

DBMSs' three views
External level: user view (can be any number)
Conceptual level: for developers and administrators
Internal level: computer view (only one)

Data independence: changes made at a certain level do not affect the higher levels.

Database languages
Data definition language: defines types and relationships;
Data manipulation language: perform tasks;
Query language: searching and computing info.
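
A minimal sketch of the three kinds of database language, assuming SQL plays all three roles (as it does in most relational DBMSs). It uses Python's built-in sqlite3, and the book table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition language: define a table and its column types.
conn.execute(
    "CREATE TABLE book (bookid INTEGER PRIMARY KEY, title TEXT, year INTEGER)"
)

# Data manipulation language: insert and update rows.
conn.executemany(
    "INSERT INTO book (bookid, title, year) VALUES (?, ?, ?)",
    [(1, "Book A", 2001), (2, "Book B", 2010), (3, "Book C", 2012)],
)
conn.execute("UPDATE book SET year = 2011 WHERE bookid = 2")

# Query language: search the data and compute a value from it.
titles = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM book WHERE year > 2005 ORDER BY bookid"
    )
]
count = conn.execute("SELECT COUNT(*) FROM book").fetchone()[0]
print(titles, count)  # ['Book B', 'Book C'] 3
```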

Complex mechanisms are needed.
Storage (e.g. storing materialized views, which uses storage redundancy to improve performance)
Replication (increase data availability)
Security
Transactions & concurrency
Migration (transform the database from one DBMS to another)
Building, maintaining, and tuning
Backup & restore

Entity relationship model in database
http://en.wikipedia.org/wiki/Entity-relationship_model

Entity: thing (nouns)
Relationship: captures how entities are related to one another (verbs)
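
A minimal sketch of the noun/verb idea, assuming a made-up example with two entities (Author, Book) and one relationship (Wrote). The class names and sample values are hypothetical and not taken from the article.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Author:   # entity (noun)
    name: str

@dataclass(frozen=True)
class Book:     # entity (noun)
    title: str

@dataclass(frozen=True)
class Wrote:    # relationship (verb) linking two entities
    author: Author
    book: Book

a = Author("Author X")
relationships = [Wrote(a, Book("Book 1")), Wrote(a, Book("Book 2"))]

# Following the relationship answers "which books did this author write?"
books_by_a = [r.book.title for r in relationships if r.author == a]
print(books_by_a)  # ['Book 1', 'Book 2']
```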

ER diagramming tools: MySQL Workbench, Open ModelSphere, etc.

Database normalization process
DatabaseNormalizationProcess.pdf

This tutorial is for beginners to get a general idea about the database normalization process.

Here is some of the information I got from this tutorial.

Three normal forms
(1) First normal form: No repeating elements or groups of elements
1NF: atomicity (get rid of repeating elements)
uniqueness

(2) Second normal form: No partial dependencies on a concatenated key
test each table for partial dependencies on a concatenated key
(a column cannot depend on only one part of the concatenated key)

(3) Third normal form: No dependencies on non-key attributes.
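
To practice, here is a minimal sketch of the second and third normal forms on a tiny made-up order table. The column names and values are hypothetical and not taken from the tutorial. The flat rows already satisfy the first normal form (every value is atomic) and are keyed by the concatenated key (order_id, item_id).

```python
# Hypothetical flat rows keyed by (order_id, item_id).
flat_rows = [
    # order_id, item_id, item_name, qty, customer_id, customer_city
    (1, "A", "Pen", 2, "C1", "Pittsburgh"),
    (1, "B", "Ink", 1, "C1", "Pittsburgh"),
    (2, "A", "Pen", 5, "C2", "Boston"),
]

# 2NF: item_name depends only on item_id (one part of the key), so it moves out.
items = {item_id: name for _, item_id, name, _, _, _ in flat_rows}
# 2NF: customer_id depends only on order_id (the other part of the key).
orders = {order_id: cust for order_id, _, _, _, cust, _ in flat_rows}
# 3NF: customer_city depends on customer_id, a non-key attribute, so it moves out.
customers = {cust: city for _, _, _, _, cust, city in flat_rows}
# What is left (qty) depends on the whole concatenated key.
order_items = {(o, i): qty for o, i, _, qty, _, _ in flat_rows}

print(items)      # {'A': 'Pen', 'B': 'Ink'}
print(orders)     # {1: 'C1', 2: 'C2'}
print(customers)  # {'C1': 'Pittsburgh', 'C2': 'Boston'}
```

After the split, "Pen" and "Pittsburgh" are each stored once, so changing an item name or a customer's city means changing only one place.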

However, after reading this, I still did not know how to carry out the database normalization process in practice. Maybe I could understand it better with the pictures and diagrams that are missing from this tutorial, or by doing some practical exercises.

Friday, September 12, 2014

Muddiest Point for Week 2

1. The "YouTube and libraries" article says “Be very careful about what you upload onto YouTube.” This points to YouTube’s copyright regulations. However, what specific methods do they use to identify whether a video is original or violates copyright? In other words, how can a library make sure it respects copyright when it uploads a video about the information resources it holds?

2. A question about Blogger.
Why is there a white background under some of the sentences I posted? Maybe something went wrong when I pasted my notes from my Word document? Or maybe I should just type on the blog directly next time?

Notes for Week 3 Required Readings

The following two readings give me a general idea about data compression and help me understand some basic features of compression and ways to approach it.

Data Compression.

http://en.wikipedia.org/wiki/Data_compression

Data compression basics

http://dvd-hq.info/data_compression_1.php

Compare Lossless & Lossy

Type	How it works	Information loss	Reversible?
Lossless compression	Identifies and eliminates statistical redundancy	No information lost	Reversible
Lossy compression	Identifies unnecessary information and removes it	Information lost is acceptable	Non-reversible (not sure)
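
A minimal sketch of the comparison above. The lossless half uses Python's built-in zlib. The lossy half is only a toy of my own (rounding sample values to the nearest ten), not a real audio or image codec, and the sample numbers are made up.

```python
import zlib

original = b"ABABABABABABABABABAB" * 50  # highly redundant data

# Lossless: redundancy is removed, and decompression restores every byte.
packed = zlib.compress(original)
restored = zlib.decompress(packed)
print(len(packed) < len(original), restored == original)  # True True

# Lossy (toy example): keep only the nearest multiple of 10.
samples = [12, 17, 23, 48, 51]
quantized = [round(s / 10) for s in samples]  # smaller numbers to store
approx = [q * 10 for q in quantized]          # best possible reconstruction
print(approx)             # [10, 20, 20, 50, 50]
print(approx == samples)  # False: the discarded detail cannot be recovered
```

This suggests the "not sure" in my table can be settled: once the detail is thrown away, nothing in the stored data can bring it back, so lossy compression is not reversible.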

Why does data need to be compressed?

It helps reduce resource usage → data storage space; transmission capacity

Factors to consider when designing a data compression scheme:

The degree of compression;

The amount of distortion introduced;

The computational resources required to compress and uncompress the data.

Uses

[Audio]

-Lossless: can decompress to an exact duplicate of the original; unable to attain high compression ratios (complexity of waveforms & rapid changes in sound form);

-Combination: allows stripping the correction to easily obtain a lossy file;

-Lossy: achieves far greater compression (discarding less-critical data); audio quality suffers when decompressed and recompressed (unsuitable for professional audio engineering applications but popular with end users)

[Video]
-spatial image compression & temporal motion compensation;

-majority: use lossy compression

[Image]
-improves the compression ratios by: channel sorting; reducing number of colors.

-quality: decrease

The following reading gives me a practical example related to data compression, especially for images.

Edward A. Galloway, “Imaging Pittsburgh: Creating a shared gateway to digital image collections of the Pittsburgh region” First Monday 9:5 2004 http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1141/1061

This article introduces a project (1 Nov. 2002 to 31 Oct. 2004) to create a gateway to visual image collections in the Pittsburgh region, including the grant partners, the image collections, the purpose, the progress, the challenges, the solutions, the outcomes and impacts.

Of all the information about this "Imaging Pittsburgh" project, the challenges they encountered and the solutions they came up with are the parts that interest me most.

Communication challenges & solutions

Challenges:

-A lack of dialogue outside the meetings;

-Little or no communication on the listserv and the project's Web site;

-Different missions and institutional cultures among partners.

Solutions:

-Project e-mail distribution list;

-A web site for posting documentation;

-Monthly meetings;

-Build a common dialogue for discussing critical elements of the project.

Selection challenges & solutions

Challenges:

-Scanning capabilities such as size, format condition, etc.

-Ensure the collection as a whole is balanced;

-Split collections

Solutions:

-The use of subject headings;

-Curators from both institutions select images and make decisions together.

Metadata challenges

Challenges:

-Wide metadata needs VS. Local needs;

-Choosing controlled vocabulary;

-The use of dates.

Solutions:

-Set eight Dublin Core elements, while each institution can include additional fields;

-Use controlled vocabulary terms from LCSH when cataloging images;

-Create two date fields, one for computer sorting, one for users.
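
A minimal sketch of why two date fields help, assuming made-up records. The field names date_display and date_sort and all the values are hypothetical and not taken from the project.

```python
# Hypothetical image records: "date_display" is what users read,
# "date_sort" is a normalized value the computer can order.
images = [
    {"title": "Image A", "date_display": "ca. 1935", "date_sort": "1935"},
    {"title": "Image B", "date_display": "June 1, 1902", "date_sort": "1902-06-01"},
    {"title": "Image C", "date_display": "1910s", "date_sort": "1910"},
]

# Sorting on the normalized field gives chronological order,
# which sorting on the display text would not.
by_date = sorted(images, key=lambda img: img["date_sort"])
print([img["date_display"] for img in by_date])
# ['June 1, 1902', '1910s', 'ca. 1935']
```

Sorting on the display text alone would put "1910s" first and "ca. 1935" last only by accident of spelling, so the separate sort field is what keeps the timeline reliable.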

Workflow challenges

Challenges:

-Creating a workflow to get the images through numerous processing steps;

-Image quality;

-The creation and use of identical software applications or field structure due to the unique set of partners.

Solutions:

-Share different workflows and ideas, reconsider and incorporate new ones;

-A minimum for DRL, a higher quality for other purposes (depends on the institution itself);

-Insisted that the appropriate and necessary metadata fields be able to be exported.

Web site development challenges

Challenges:

-Develop consistent copyright and permission statements;

-Best way to communicate with users;

-Emphasizing access to the collection as a whole, while also maintaining the individual identity of each collection;

-The limitations of the DLXS Image-Class middleware and internal resources.

Solutions:

-Two-fold strategy → developed a generic copyright statement; each individual image’s metadata contains a copyright field;

-Develop a sophisticated e-mail system, while continuing to use the current e-mail distribution list;

-Plan on conducting an online survey, and create OAI records.

Avenues for exploring

Challenges:

-Develop creative ways to help users explore the collections.

Solutions:

-Cataloging the images (via time, place, and theme);

-Clickable city map.

The following article offers an eye-opening way for libraries to share their resources and services, especially libraries’ instruction and training.

Paula L. Webb, YouTube and libraries: It could be a beautiful relationship C&RL News, June 2007 Vol. 68, No. 6

http://crln.acrl.org/content/68/6/354.full.pdf

Advantages:

-- The most popular internet television or video distribution site;

-- Easy to get started;

-- Provide easy access to guides anywhere;

-- Notifications & RSS;

-- Good for visual learners.

One question:

The article says “Be very careful about what you upload onto YouTube.” This points to YouTube’s copyright regulations. However, what specific methods do they use to identify whether a video is original or violates copyright? In other words, how can a library make sure it respects copyright when it uploads a video about the information resources it holds?

Friday, September 5, 2014

Notes for Week 2 Required Readings

A Few Thoughts on the Google Books Library Project

http://www.educause.edu/ero/article/few-thoughts-google-books-library-project

The author’s opinion is that accessible digital formats are the only way to guarantee knowledge’s survival. To support his point, he shows evidence of the increasing use of digital sources, and gives us some examples of how Google has succeeded in digitizing information and knowledge to make it widely available.

As a future librarian, this article made me reconsider the importance of using effective technology to make a library’s information easily accessible, though there are always many problems for libraries to overcome when introducing new technology. At the same time, it also made me think about libraries’ attitude towards private companies such as Google. I have always seen these companies as competitors to libraries, especially in the information age. However, after reading this article, I started to think that maybe we should seek opportunities to cooperate with them, so that both sides benefit from the cooperation. In fact, the next article gives a very specific example of the problems that come with introducing new technologies, and the third article provides some successful cases of libraries cooperating with private companies.

Vaughan, J. (2005). Lied Library @ four years: technology never stands still. Library Hi Tech, 23(1), 34-49.

This article gives us a very specific case of how Lied Library brought new technology into its system, what problems and challenges it met, and how it overcame those problems and even turned the challenges into opportunities.

One thing is certain for libraries in this fast-changing world: we must embrace change and actively introduce new technologies into our libraries. However, there is never an easy way to make such adjustments. Bringing a new technology into the library usually involves three major stages: an early period of preparation, an interim period of implementation, and a later period of maintenance and upgrades. Some problems and challenges are common to all three stages: money, human resources, legal issues, etc. Beyond the approaches this article mentions, I believe that the key to success is different for each library. Understanding the library’s own situation should always come first, before we exhaust ourselves looking for the best solutions. Another tip would be to never limit your thinking. Cooperating with private companies is a really good example of breaking the “rules”, and European libraries have some great experiences for us to learn from.

Doreen Carvajal. European libraries face problems in digitalizing. New York Times. October 28, 2007

http://www.nytimes.com/2007/10/28/technology/28iht-LIBRARY29.1.8079170.html

This article provides some great examples of libraries seeking different funding models, especially cooperation with private partners.

These successful examples suggest to me that maybe it's time for libraries to break old patterns and make a move. For instance, when we talk about our relationship with private companies such as Google, Amazon, and some publishers, I think it is just as the Bibliothèque Nationale de France's new president, Bruno Racine, said: "We are not at war, so to speak." Those private companies may still be our competitors, but that doesn't mean we cannot work as allies. Still, we need a healthy market mechanism and the necessary legal and technological measures to guarantee that.