File: www.idi.ntnu.no/grupper/su/publ/db-papers/citation-db-eval.html
Reidar Conradi
IDI, NTNU
conradi@idi.ntnu.no
24.06.2011
On the utility of 10 major citation databases for scientific papers
Table of contents
1. Short examples of using DB.1-DB.10 for Sjøberg (DIK) and Conradi (RC).
2. Characterizing the major citation databases
2.1 DB.1 ISI Web of Knowledge (now by Thomson Reuters)
2.2 DB.2. Microsoft Academic Research (MAR)
2.3 DB.3. Harzing's Publish and Perish (PoP)
2.4 DB.4 GoogleScholar
2.5 DB.5 DBLP
2.6 DB.6 CiteseerX
2.7 DB.7 Arnetminer
2.8 DB.8 Cristin (ex-Frida)
2.9 DB.9 NTNU-SU group's publication list
2.10 DB.10 SCOPUS (life and physical sciences)
3. Concluding remarks
4. Appendix: Some big examples for Sjøberg and Conradi regarding PoP.
4.1 CASE-1: Some PoP-errors in the 109 papers in pop-sjoberg-all.tex
4.2 CASE-2: The missing 12 PoP-papers in pop-sjoberg-excl-chem-mat.tex
4.3 CASE-3: Prune Conradi's 337=>297 papers on pop-conradi-EngCSMath.tex
Log:
24.Sep.2001 (RC): Adjusted the descriptions for
ISI Web of Knowledge and Arnetminer.
26.Sep.2001 (RC): Added SCOPUS, empty in the start.
1. Short examples of using DB.1-DB.10 for Sjøberg (DIK), plus SU-group:
Conradi (RC), ... Jaccheri (MLJ), and ... Daniela S. Cruzes (DSC).
Subject (topic): "IT", Period: year 0-2010, PhDs: only own,
'na' means not available.
Database                   Person    #publications    #citations  G-index  H-index
DB.1 ISI                   DIK:      38               413         ?        12
                           RC:       89 **            394         ?        11
                           MD:       7                7           ?        2
                           MLJ:      30 ##            65          ?        5
                           TSk:      8                0           ?        0
                           TSt:      42               52          ?        3
                           AIW:      17 (23-6)        5           ?        2
                           DSC:      9                8           ?        2
                           GSindre:  46               377         ?        7
                           JAGulla:  18               22          ?        4
DB.2 Microsoft Acad. Res.  DIK:      71??             438         18       12
                           RC:       209              1560        33       11
                           MLJ:      57 ##            ?           ?        ?
                           DSC:      17               ?           ?        ?
DB.3 Publish and Perish    DIK:      88               ?           ?        ?
                           RC:       297              ?           ?        ?
                           MLJ:      ?                ?           ?        ?
DB.4 GoogleScholar         DIK:      449              na?         na       na
                           RC:       1070             na?         na       na
                           MLJ:      ?                na?         na       na
DB.5 DBLP                  DIK:      57               ?           na       na
                           RC:       127              ?           na       na
                           MD:       26               ?           na       na
                           MLJ:      34               ?           na       na
                           TSk:      4                ?           na       na
                           TSt:      40               ?           na       na
                           AIW:      29               ?           na       na
                           DSC:      17               ?           na       na
DB.6 CiteseerX             DIK:      21               ?           na       5
                           RC:       139              ?           na       14
                           MLJ:      ?                ?           na       ?
DB.7 Arnetminer            DIK:      50               883         na       16
                           RC:       129              2374        na       24
                           MD:       26               339         na       10
                           MLJ:      30 ##            205         na       7
                           TSk:      1                0           na       0
                           TSt:      34               268         na       8
                           AIW:      28               120         na       7
                           DSC:      14               60          na       3
DB.8 Cristin               DIK:      128              na          na       na
  (last 10-15 years)       RC:       208              na          na       na
                           MD:       67               na          na       na
                           MLJ:      81               na          na       na
                           TSk:      57               na          na       na
                           TSt:      84               na          na       na
                           AIW:      182              na          na       na
                           DSC:      28               na          na       na
DB.9 NTNU-SU               DIK:      na               na          na       na
                           RC:       207 (45 jour.)   na          na       na
                           MLJ:      90 ##            na          na       na
DB.10 SCOPUS (not tried)   DIK:      ??               ?           ?        ?
                           RC:       ??               ?           ?        ?
                           MLJ:      27               ?           ?        ?
Comments:
##: MLJ (Jaccheri) has 90 entries in total (by SU), 27 (by ISI)
and 30 (by ARNET)!!, with 20 in overlap between the latter two,
i.e. 37 from these two DBs, including 2 books and 3 book chapters.
Remaining 50 publ.s: ca. 25 of sufficient quality
(but not found in citation DBs) and
ca. 25 informal (OK to be separately listed
outside DBs?).
MAR database has 57 publ.s, which for MLJ seems OK in size,
but contents remain to be checked.
See also PoP examples CASE-1, CASE-2 and CASE-3 in Appendix.
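The overlap arithmetic for MLJ in the comment above can be checked with a few lines of Python. This is only a sketch: the function name is hypothetical, the counts are the ones quoted in the comment, and it assumes that all 37 database entries are also among the 90 SU entries.

```python
def union_size(n_a: int, n_b: int, n_overlap: int) -> int:
    """Number of distinct entries in two lists that share n_overlap entries."""
    return n_a + n_b - n_overlap

# Counts quoted for MLJ in the comment above.
isi, arnet, overlap = 27, 30, 20
in_citation_dbs = union_size(isi, arnet, overlap)

# Assumption: every database entry is also in the SU group's list of 90.
su_total = 90
not_in_citation_dbs = su_total - in_citation_dbs

print(in_citation_dbs, not_in_citation_dbs)
```

Under that assumption, 37 entries are covered by the two databases and 53 are not, which is in line with the "ca. 25 + ca. 25" estimate for the remaining publications.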
DB.1 ISI Web of Knowledge
=========================
Need a (site) license, e.g. via NTNU.
Covers all fields, but only articles in indexed journals.
Thus, it misses most books, book chapters, and not to forget
good conference papers (ICSE, VLDB, IJCAI, OOPSLA, and similar).
ISI recently won over SCOPUS to be the main "import channel" to Cristin.
URL: http://apps.webofknowledge.com/
(This database tool is easier to use than the URL listed below.)
Be sure you are in the "Web of Science".
Look at the menu below the name "Search", and
fill-in the "Author" slot with a text like:
Sjoberg DIK
Conradi R*
(Jaccheri L*) OR (Jaccheri M*)
Then click on the "Search" (or "Clear") button below.
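The three "Author" slot forms listed above follow one pattern: surname, initials, an optional '*' wildcard, and OR between parenthesized alternatives. A small sketch of that pattern follows. The helper name and its parameters are hypothetical, and it only builds the text to paste into the slot; it does not call any Web of Science interface.

```python
def author_query(surname: str, initials: list[str], wildcard: bool = True) -> str:
    """Build an Author-slot string such as '(Jaccheri L*) OR (Jaccheri M*)'."""
    suffix = "*" if wildcard else ""
    terms = [f"{surname} {ini}{suffix}" for ini in initials]
    if len(terms) == 1:
        return terms[0]
    # Several initial variants: parenthesize each one and join with OR.
    return " OR ".join(f"({t})" for t in terms)

print(author_query("Sjoberg", ["DIK"], wildcard=False))
print(author_query("Conradi", ["R"]))
print(author_query("Jaccheri", ["L", "M"]))
```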
Ex. With a query of (Conradi) and no further preferences,
we initially get 1027 publication entries.
So we need to click on the submenus of "Refine Results" under "Web
of Science Categories", like
ENGINEERING ELECTRICAL ELECTRONIC (74) and
COMPUTER SCIENCE THEORY METHODS (54),
and so on for more sub-categories.
At last, we may end up with 127 publication entries,
each containing a DOI and lots of other info; really impressive!
To see all the entries, click on the "Create Citation Report" text,
standing furthest to the right of "Refine Results".
There is also a manual selection facility to "fine-prune" the "catch".
URL: http://apps.isiknowledge.com, cf. also above.
First set the button 'Limit to:', then select 'All Years' or similarly.
Then go to 'Web of Science' in the top headings, and
now select a more precise year interval.
Then click on the 'Author Finder' command:
Step 1: Enter Author Name, e.g. 'Conradi R'.
Step 2: Select Author Variant, e.g. 'Conradi R' => 'Conradi R*'.
Step 3: Select Subject Category, choose at least one of:
LIFE SCIENCES & BIOMEDICINE
MULTIDISCIPLINARY SCIENCE & TECHNOLOGY (use this!)
PHYSICAL SCIENCES
Step 4: Select Institution from a list, and max 50 such.
Ex. I don't include UNIV HEIDELBERG, if 'Conradi R' is selected.
Lastly click on 'Finish Now'.
You may now fine-tune your selection by sub-subjects and document types.
Then:
indicate 'Create Citation Report' on the top-right, and
cut&save the displayed summary with your h-index etc.
Possibly also:
indicate 'Records' on the bottom-left, and
specify a min-max interval (e.g. papers ordered from '1' to '100'),
select which document types to consider,
specify whether to include an abstract (say No!) etc., and
finally specify the record format (e.g. BibTex) and a file name
to store the textual records.
DB.2 Microsoft Academic Research (MAR)
======================================
URL: http://academic.research.microsoft.com/Organization/13557 (for NTNU).
Free to use, covers all fields, appears to be OK.
Start to click on 'Advanced search',
then select 'Computer Science'.
Choose first 'Author' (gives #papers, #citations, G/H-index),
then 'Publication' (to get BibTex files).
The user interface is a bit messy, e.g. one must write 'Dag I.K. Sjoberg'.
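For reference, the H-index and G-index reported by tools such as MAR can be recomputed from a list of per-paper citation counts. The sketch below uses the standard definitions: h is the largest number such that h papers have at least h citations each, and g is the largest number such that the top g papers together have at least g*g citations. The citation counts in the example are invented for illustration and do not belong to any of the authors discussed here.

```python
def h_index(cites):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(cites, reverse=True)
    h = 0
    for rank, c in enumerate(ranked, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

def g_index(cites):
    """Largest g such that the top g papers together have >= g*g citations."""
    ranked = sorted(cites, reverse=True)
    total, g = 0, 0
    for rank, c in enumerate(ranked, start=1):
        total += c
        if total >= rank * rank:
            g = rank
    return g

example = [10, 8, 5, 4, 3, 0]   # invented citation counts
print(h_index(example), g_index(example))
```

Note that this g-index is capped at the number of papers in the list; some tools extend the list with zero-cited dummy papers instead.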
DB.3 Harzing's Publish and Perish (PoP)
=======================================
URL: www.harzing.com/pop.htm.
First a free, executable version must be downloaded and
installed on your PC.
It covers all fields and gets its data from GoogleScholar - thus immature.
Often start by clicking on 'Citation analysis' on top-left,
then e.g. the submenu 'Author impact analysis'.
Then give your author name (e.g. 'R Conradi'),
possibly some excluded names (e.g. 'FR Conradi'),
a year interval, and finally
select one or more of the seven subjects below:
1. Biology, Life Sciences, Environmental Science.
2. Business, Administration, Finance, Economics.
3. Chemistry and Materials Science.
4. Engineering, Computer Science, Mathematics - taken as 'IT'-related.
5. Medicine, Pharmacology, Veterinary Science.
6. Physics, Astronomy, Planetary Science.
7. Social Sciences, Arts, Humanities.
Finally click on 'lookup' button on the top-right.
Results come as a summary (with h-index etc.), or - if requested via the
top-left 'File' menu - also as BibTex entries on a given textual file.
PoP seems much more liberal than the other databases, as its raw data is
provided by Google Scholar. The first ca. 20??% of the cited papers look OK,
but the last 20% are below any acceptable quality limit. In addition come
hordes of duplicates, trivial textual errors, and even some plain garbage.
To check the actual precision of PoP, we have taken some
PoP-reported citations for two Norwegian SE researchers:
'DIK Sjoberg' for Dag Ingar Kondrup Sjøberg, Ifi, UiO.
'R Conradi' for Reidar Conradi, IDI, NTNU.
See files:
pop-sjoberg-all.tex 109 papers, all seven fields;
testing duplicates, errors etc.
pop-sjoberg-excl-chem-mat.tex 97 papers, all except Chem&Materials;
testing field impact.
pop-sjoberg-only-EngCSMath.tex 88 papers, only 'IT'-related;
testing core IT papers.
pop-conradi-all.tex 981 papers, all seven fields; !!!
ex. of non-IT colleagues.
pop-conradi-only-EngCSMath.tex 337 papers, only 'IT'-related;
must filter non-IT papers of
'R Conradi' to get 297 real ones.
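PoP's BibTeX export records the Google Scholar citation count in a note field of the form 'note = {N cites: ...}', as the entries in the Appendix show. A minimal sketch for pulling these counts out of such a file follows. The function name is hypothetical, the sample keeps only two of the Appendix entries, and the URLs are shortened here for readability.

```python
import re

# Matches the PoP-style field: note = {201 cites: http://...}
CITES_RE = re.compile(r"note\s*=\s*\{\s*(\d+)\s+cites:")

def cite_counts(bibtex_text: str) -> list[int]:
    """Return the citation counts found in PoP-style note fields."""
    return [int(n) for n in CITES_RE.findall(bibtex_text)]

# Two entries reduced to their note fields (URLs shortened for this sketch).
sample = """
@article{pop0000a1,
  note = {201 cites: http://scholar.google.com/scholar?cites=(shortened)},
  note = {Query date: 14.06.2011},
}
@article{pop000a11,
  note = {57 cites: http://scholar.google.com/scholar?cites=(shortened)},
  note = {Query date: 14.06.2011},
}
"""
counts = cite_counts(sample)
print(counts, sum(counts))
```

Entries without a "cites" note (PoP omits it for uncited papers in the listings below) are simply skipped, so they would have to be counted as zero separately.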
DB.4 GoogleScholar
==================
URL: http://www.googlescholar.com, free to use, all fields.
Immature since it includes *all* documents ever co-written
by a given person (thousands ...) and with many errors.
Does not yet compute #citations, G-index, or H-index.
DB.5 DBLP
=========
URL: http://www.informatik.uni-trier.de/~ley/db/, from Univ. Trier.
Free to use, only IT covered, good quality,
often comes with links to abstract and .pdf-file.
DB.6 CiteseerX
==============
URL: http://citeseerx.ist.psu.edu/.
Free to use, only IT covered, good quality.
Mostly used for single persons and (partial) document titles.
Emphasis on giving .pdf-files.
DB.7 Arnetminer
===============
URL: http://arnetminer.org/ -- fairly new and in rapid development.
Just type in your name.
Then click on your picture in the "upper-left corner" to get a
list of your publications, placed at the bottom.
The publication list is by default sorted from newest to oldest.
Click on a paper title to get the textual contents as a .pdf-file,
often via DOI.
Above this list is a very nice "co-author star" in colors,
to display yourself in relation to your co-authors.
DB.8 Cristin (ex-Frida)
=======================
URL: http://www.ntnu.no/ub/cristin -- for Norwegian R&D institutions.
Not very impressive, as the main functionality is 'bean counting'.
Queries can only have one author name.
Ex. Cannot define general groups, only disjoint subsets of existing units.
DB.9 SU group's publication list
================================
URL: http://www.idi.ntnu.no/grupper/su/INT-PUBL.php3.
Free to use, only insiders can enter data.
As *one* textfile written in .php3, with meta-symbols and .pdf files.
Special file for the ca. 150 PhD theses since 1970.
Functionality: display papers for given year (default is current year).
display papers according to document type and year.
display papers according to search terms and AND or OR-op
(e.g. 'Reidar Conradi journal 2011').
Over 1000 entries, but it needs another platform and wrapping!
DB.10 SCOPUS
============
URL: http://www.scopus.com/home.url
Free to use, only insiders can enter data.??
Mostly life and physical sciences
3. Concluding remarks
=====================
Many and diverse databases for academic papers!!
If only journal papers are requested: choose ISI.
If only "IT" coverage is needed: choose DBLP.
If comprehensive coverage with G/H indices is needed: choose Arnetminer or MAR.
PoP, which builds on GoogleScholar, is premature; see the Appendix below.
Ex. Almost all TeX entries have one or several errors:
@article instead of @inproceedings + missing @proceedings and editor
@book instead of @proceedings
Ex. Of 109 TeX entries for Sjøberg, 27 are duplicates and 10 are garbage.
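The first kind of error, @article used for a conference paper, can be partly spotted mechanically. The sketch below flags @article entries whose journal field contains a word that suggests a meeting venue. The word list and the function name are assumptions chosen for illustration, not a complete rule, and the sample journal strings are copied from PoP entries in the Appendix.

```python
import re

# Words that suggest a meeting venue rather than a journal (illustrative list).
VENUE_HINT = re.compile(r"proceedings|conference|workshop|symposium|seminar",
                        re.IGNORECASE)

def looks_like_inproceedings(entry_type: str, journal: str) -> bool:
    """True if an @article entry's journal field reads like a meeting venue."""
    return entry_type.lower() == "article" and bool(VENUE_HINT.search(journal))

samples = [
    ("pop000a63", "article", "Proceedings of Achieving Quality in Information …"),
    ("pop000a22", "article", "… Conference on Software Engineering. IEEE Computer …"),
    ("pop000a14", "article", "Software, …"),
]
flagged = [key for key, kind, journal in samples
           if looks_like_inproceedings(kind, journal)]
print(flagged)
```

Such a check only raises candidates for manual inspection; truncated journal fields like 'Software, …' give it nothing to work with.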
4. Appendix: Some big examples for Sjøberg and Conradi regarding PoP.
4.1 CASE-1: Some PoP-errors in the 109 papers in pop-sjoberg-all.tex
====================================================================
RC: added an 'a'-prefix (like 0000a1, 0000a2, ...) to name these papers.
Many duplicates (21 and counting ...) and much garbage (at least 7).
D1. CONTEX paper, 5 duplicates
------------------------------
@article{pop0000a1,
author = {DIK Sjøberg and JE Hannay and O Hansen and ...},
title = {A survey of controlled experiments in software engineering},
journal = {IEEE Transactions on …},
publisher = {computer.org},
url = {http://www.computer.org/portal/web/csdl/doi/10.1109/TSE.2005.97},
year = {2005},
note = {201 cites: http://scholar.google.com/scholar?
cites=9068752407816095539\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a11,
author = {DIK Sjoberg and JE Hannay and ...},
title = {Vigdis By Kampenes, Amela Karahasanovic, Nils-Kristian Liborg,
Anette C. Rekdal, A Survey of Controlled Experiments in
Software Engineering},
journal = {IEEE Transactions on Software Engineering},
year = {2005},
note = {57 cites: http://scholar.google.com/scholar?
cites=3972648676543320866\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a43,
author = {DIK SJØBERG and JE HANNAY and O HANSEN and ...},
title = {V., KARAHASANOVI C, A., LIBORG, N.-K., AND C. REKDAL},
journal = {A. A survey of controlled …},
year = {2005},
note = {2 cites: http://scholar.google.com/scholar?
cites=1303181160453406441\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a55, %% Note also textual error in title.
author = {DIK Sjøberg and JE Hannay and ...},
title = {Vigdis By Kampenes, Amela Karahasanovic, Nils-Kristian Liborg,
and Anette C. Rekdal. 2005."
A Survey of Controlled Experiments in Software …},
journal = {IEEE Transactions on Software Engineering},
note = {3 cites: http://scholar.google.com/scholar?
cites=14313076272726917795\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a58,
author = {DIK Sjoberg and JE Hannay and ...},
title = {Ka mpenes, VB, Kar ahasanovic, A., Liborg, N.-K. and Rekdal,
AC 2005. A Su rvey of Controlled Experim ents in
Software Engineering},
journal = {IEEE Transactions on Software Engineering},
note = {2 cites: http://scholar.google.com/scholar?
cites=13627441494292258647\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a69,
author = {DIK Sjøberg and JE Hannay and O Hansen and
VB Kampenes and ...},
title = {A Survey of Controlled Experiments in Software},
journal = {Engineering. In IEEE …},
year = {2005},
note = {2 cites: http://scholar.google.com/scholar?
cites=6631457363046079464\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
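Many of the duplicates above differ only in case, spacing, punctuation, or in author names that have leaked into the title field. One possible way to catch them, sketched here under the assumption that a containment test on normalised titles is good enough, is to reduce each title to its letters and digits and check whether the shorter key occurs inside the longer one. The function names and the minimum-length threshold are hypothetical; the titles are taken from the entries above.

```python
import re
import unicodedata

def title_key(title: str) -> str:
    """Lower-case, strip accents, and keep only letters and digits."""
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]", "", ascii_only.lower())

def probably_same(title_a: str, title_b: str, min_len: int = 30) -> bool:
    """True if the shorter normalised title is contained in the longer one."""
    short, long_ = sorted((title_key(title_a), title_key(title_b)), key=len)
    # min_len guards against very short titles matching almost anything.
    return len(short) >= min_len and short in long_

a1 = "A survey of controlled experiments in software engineering"
a58 = ("Ka mpenes, VB, Kar ahasanovic, A., Liborg, N.-K. and Rekdal, "
       "AC 2005. A Su rvey of Controlled Experim ents in Software Engineering")
a69 = "A Survey of Controlled Experiments in Software"
a43 = "V., KARAHASANOVI C, A., LIBORG, N.-K., AND C. REKDAL"

print(probably_same(a1, a58), probably_same(a1, a69), probably_same(a1, a43))
```

The last case shows the limit of a title-only comparison: entry a43 has lost its real title to the journal field, so it has to be found by hand or by comparing other fields as well.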
D2. Future of Software Engineering research paper: 4 duplicates
---------------------------------------------------------------
@book{pop0000a4, %% @book => @article??
author = {DIK Sjoberg and T Dyba and ...},
title = {The future of empirical methods in
software engineering research},
publisher = {computer.org},
url = {http://www.computer.org/portal/web/csdl/doi/10.1109/FOSE.2007.30},
year = {2007},
note = {88 cites: http://scholar.google.com/scholar?
cites=16645473109522102835\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a22,
author = {DIK Sjoberg and T Dyba and ...},
title = {The Future of Empirical Methods in Software Engineering Research},
journal = {… Conference on Software Engineering. IEEE Computer …},
year = {2007},
note = {19 cites: http://scholar.google.com/scholar?
cites=2744968747580700492\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a47,
author = {DIK Sjøberg and T Dybå and ...},
title = {The future of empirical methods in software engineering research.
Future of Software Engineering},
journal = {… of the 29th International Conference on …},
year = {2007},
note = {2 cites: http://scholar.google.com/scholar?
cites=6658786638492474235\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a54,
author = {DIK Sjøberg and T Dybå and ...},
title = {The Future of Empirical Methods in Software Engineering Research.
presented at Future of Software Engineering--
29th International Conference on …},
journal = {IEEE Computer Society},
year = {2007},
note = {3 cites: http://scholar.google.com/scholar?
cites=11436713780192218117\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a56,
author = {DIK Sjøberg and ...}, %% author+title: wrong text.
title = {Jørgensen. M. 2007. The Future of Empirical Methods
in Software Engineering Research},
journal = {Future of Software Engineering (FOSE'07), ed. by …},
note = {2 cites: http://scholar.google.com/scholar?
cites=15412413240314648\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
D3. Pair Programming meta-analysis paper: 2 duplicates
------------------------------------------------------
@article{pop000a29,
author = {… and T Dybå and E Arisholm and DIK Sjøberg},
title = {The effectiveness of pair programming: A meta-analysis},
journal = {Information and Software …},
publisher = {Elsevier},
url = {http://linkinghub.elsevier.com/retrieve/pii/S0950584909000123},
year = {2009},
note = {17 cites: http://scholar.google.com/scholar?
cites=7044227830545391005\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a78,
author = {… and T Dyba and E Arisholm and DIK Sjoberg},
title = {The effectiveness of pair programming},
journal = {Information and …},
publisher = {dialnet.unirioja.es},
url = {http://dialnet.unirioja.es/servlet/articulo?codigo=3003393},
year = {2009},
note = {Query date: 14.06.2011},
}
@article{pop000a94,
author = {T Dybå and E Arisholm and DIK Sjøberg and JE Hannay and ...},
title = {Studies on effectiveness},
journal = {computer.org}, %% RC: assuming Journal of IST ??
note = {Query date: 14.06.2011},
}
D4. Pair Programming - IEEE SW paper: one duplicate
---------------------------------------------------
@article{pop000a14,
author = {T Dyba and E Arisholm and DIK Sjoberg and ...},
title = {Are two heads better than one?
On the effectiveness of pair programminga},
journal = {Software, …},
publisher = {ieeexplore.ieee.org},
url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4375233},
year = {2007},
note = {48 cites: http://scholar.google.com/scholar?
cites=1267146913481237176\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a92,
author = {T DYBA and E ARISHOLM and DIK SJOBERG and JE HANNAY and ...},
title = {On the effectiveness of pair programming},
journal = {IEEE software},
publisher = {cat.inist.fr},
url = {http://cat.inist.fr/?aModele=afficheN\&cpsidt=19205363},
year = {2007},
note = {Query date: 14.06.2011},
}
D5. AQUIS paper: one duplicate
------------------------------
@article{pop000a63,
author = {… and DIK Sjøberg},
title = {A simple effort prediction interval method},
journal = {Proceedings of Achieving Quality in Information …},
year = {2002},
note = {2 cites: http://scholar.google.com/scholar?
cites=12271826550010681096\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a64,
author = {… and DIK Sjøberg},
title = {A Simple Effort Prediction Interval Approach},
journal = {Achieving Quality in Information Systems (AquIS)},
year = {2002},
note = {2 cites: http://scholar.google.com/scholar?
cites=7105812472919430453\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
D6. POS-9 proceedings: one duplicate
------------------------------------
@book{pop000a74, %% author => editor
author = {… and A Dearle and DIK Sjøberg},
title = {Persistent object systems: design, implementation, and use:
9th International Workshop, POS-9, Lillehammer, Norway,
September 6-8, 2000: revised papers},
publisher = {books.google.com},
year = {2001},
note = {Query date: 14.06.2011},
}
@article{pop000a91, %% title has errors??
author = {… and A Dearle and DIK Sjøberg},
title = {POS-9: persistenet object systems: design, implementation,
and use:(Lillehammer, revised papers)},
journal = {Lecture notes in computer science},
publisher = {cat.inist.fr},
url = {http://cat.inist.fr/?aModele=afficheN\&cpsidt=65164},
year = {2001},
note = {Query date: 14.06.2011},
}
D7. Basili SE protocol paper: one duplicate
-------------------------------------------
@article{pop000a35,
author = {VR Basili and MV Zelkowitz and DIK Sjøberg and ...},
title = {Protocols in the use of empirical software engineering artifacts},
journal = {Empirical Software …},
publisher = {Springer},
url = {http://www.springerlink.com/index/k0508k4648004760.pdf},
year = {2007},
note = {10 cites: http://scholar.google.com/scholar?
cites=6562074275962780612\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@book{pop000a80, %% book/editor??
author = {VR Basili and MV Zelkowitz and DIK Sjøberg and P Johnson and ...},
title = {Empir Software Eng DOI 10.1007/s10664-006-9030-4
Protocols in the use of empirical software engineering artifacts},
publisher = {Citeseer},
url = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.66.4027},
year = {2008},
note = {Query date: 14.06.2011},
}
D8. Guide to advanced ESE: one duplicate
----------------------------------------
@book{pop000a18, %% author => editor
author = {… and J Singer and DIK Sjøberg},
title = {Guide to advanced empirical software engineering}, %% advanced??
publisher = {books.google.com},
year = {2007},
note = {26 cites: http://scholar.google.com/scholar?
cites=16889680598587760632\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop00a107,
author = {… and Janice. Singer and DIK Sjøberg},
title = {Guide to Advanced Empirical Software Engineering},
journal = {Springer-Verlag London},
note = {Query date: 14.06.2011},
}
D9. Software constraint models: one duplicate
---------------------------------------------
@article{pop00a104,
author = {DIK Sjøberg},
title = {Tittel: Software constraint models Undertittel: a means to
improve maintainability and consistency Publisert år: 1994
Dokumenttype: Artikkel Språk: Engelsk},
journal = {duo.uio.no},
url = {http://www.duo.uio.no/sok/work.html?WORKID=89934},
note = {Query date: 14.06.2011},
}
@article{pop00a106,
author = {DIK Sjøberg},
title = {Software Constraint Models–A Means to
Improve Maintainability and Consistency},
journal = {Citeseer},
note = {Query date: 14.06.2011},
}
D10. Thesaurus-based methodologies ...: one duplicate
-----------------------------------------------------
@book{pop0000a9, %% book is his PhD thesis??
author = {DIK Sjøberg},
title = {Thesaurus-based methodologies and tools for
maintaining persistent application systems},
publisher = {University of Glasgow},
year = {1993},
note = {20 cites: http://scholar.google.com/scholar?
cites=14572592812034530942\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a41,
author = {DIK Sjøberg and MP Atkinson and ...},
title = {Thesaurus-based software environments},
journal = {The Intersection between …},
publisher = {Citeseer},
year = {1994},
note = {5 cites: http://scholar.google.com/scholar?
cites=180172948683811808\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a95,
author = {DIK Sjøberg and MP Atkinson and ...},
title = {Tittel: Thesaurus-based software environments Publisert år: 1994 Dokumenttype: Artikkel Språk: Engelsk},
journal = {duo.uio.no},
url = {http://www.duo.uio.no/sok/work.html?WORKID=89935},
note = {Query date: 14.06.2011},
}
D11. Tichy-shared maintenance paper: one duplicate
--------------------------------------------------
@article{pop000a21,
author = {M Vokáč and W Tichy and DIK Sjøberg and E Arisholm and ...},
title = {A controlled experiment comparing the maintainability of
programs designed with and without design
patterns—a replication in a real programming environment},
journal = {Empirical Software …},
publisher = {Springer},
url = {http://www.springerlink.com/index/m370523qvm4489h4.pdf},
year = {2004},
note = {33 cites: http://scholar.google.com/scholar?
cites=15494631168699208001\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a99,
author = {W Tichy and DIK Sjøberg and E Arisholm and ...},
title = {A Controlled Experiment Comparing the Maintainability of
Programs Designed With And Without Design Patterns:
A Replication In A Real Programming Environment”, …},
journal = {Empirical Software …},
publisher = {Citeseer},
url = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.143.3325},
year = {2004},
note = {Query date: 14.06.2011},
}
D12. SPIQ reflection paper for EWSPT'03: one duplicate.
-------------------------------------------------------
@article{pop000a46,
author = {R Conradi and T Dybå and DIK Sjøberg and ...},
title = {Lessons learned and recommendations from two large
norwegian SPI programmes},
journal = {… process technology: 9th …},
publisher = {books.google.com},
url = {http://books.google.com/books?hl=en\&lr=\&id=RooizOPAQX8C\&
oi=fnd\&pg=PA32\&dq=DIK+Sjoberg\&ots=RsJWA7UhZQ\&
sig=tTV_2jy3I9pBIz8kdxcnbtKkcII},
year = {2003},
note = {3 cites: http://scholar.google.com/scholar?
cites=6917046709958087706\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop00a103,
author = {RCT Dybå and DIK Sjøberg and ...},
title = {Lessons Learned and Recommendations from
Two Large Norwegian SPI Programmes},
journal = {Software Process Technology},
publisher = {Springer},
url = {http://www.springerlink.com/index/h61mxacxmk3f1y7a.pdf},
year = {2003},
note = {Query date: 14.06.2011},
}
D13. Code smell paper: one duplicate
------------------------------------
@article{pop000a61,
author = {… and DS Cruzes and DIK Sjoberg},
title = {Are all code smells harmful},
journal = {A study of God Classes and Brain Classes in …},
year = {2010},
note = {2 cites: http://scholar.google.com/scholar?
cites=8517321368272658612\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a66,
author = {… and DS Cruzes and DIK Sjoberg},
title = {Are all code smells harmful? A study of God Classes
and Brain Classes in the evolution of three open source systems},
journal = {… (ICSM), 2010 IEEE …},
publisher = {ieeexplore.ieee.org},
url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5609564},
year = {2010},
note = {2 cites: http://scholar.google.com/scholar?
cites=16245556814282503450\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
D14. Theory Use paper: one duplicate
------------------------------------
D15. ?? paper: one duplicate
----------------------------
D16. ?? paper: one duplicate
----------------------------
-- and there may be more duplicates!!
Errors and Garbage
==================
G1. Sjøberg's CV as a paper: one paper %% crazy!!
--------------------------------------
@article{pop000a37,
author = {DIK Sjøberg},
title = {received the MSc degree in computer science from the
University of Oslo in 1987 and the PhD degree in
computing science from the University of …},
journal = {Empirical Software Enineering},
note = {2 cites: http://scholar.google.com/scholar?
cites=5736062937442514044\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
G2. Title is wrong: mixed authors and numbers, submitted to, misc.
- at least six papers
------------------------------------------------------------
@article{pop000a77,
author = {… and SK Shrivastava and DIK Sjoberg and ...},
title = {Anfindsen, Ole J. 215 Atkinson, Malcolm v, 1, 235,307,
335 Berman, S. 171, 250 Blackburn, Stephen M. 37, 215, 259, 363},
journal = {… in persistent object …},
publisher = {Morgan Kaufmann Pub},
year = {1999},
note = {Query date: 14.06.2011},
}
@article{pop000a49,
author = {DIK Sjøberg and PC Philbrow and C Waite and ...},
title = {Build management in database programming language environments},
journal = {Submitted to: 6th International …},
year = {1995},
note = {2 cites: http://scholar.google.com/scholar?
cites=399355406703947029\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
}
@article{pop000a76, %% a7 = a48
author = {… and T Dybå and DIK Sjøberg and JE Hannay and
DIK Sjøberg and ...},
title = {ARE ENGINEERING},
journal = {ieeexplore.ieee.org},
url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4052582},
note = {Query date: 14.06.2011},
%% Double author by DIK Sjøberg??
%% JE Hannay, DIK Sjøberg, T. Dyba:
%% 'A Systematic Review of Theory Use in Software Engineering
%% Experiments',
%% IEEE Transactions on Software Engineering, Feb. 2007,
%% 33(2):87-107, DOI:10.1109/TSE.2007.12.
}
@book{pop000a85, %% @book => which? @article
author = {… and G Brunet and M Chechik and BCD Anda and
DIK Sjøberg and ...},
title = {ARE ENGINEERING},
publisher = {ieeexplore.ieee.org},
url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5061639},
year = {2009},
note = {Query date: 14.06.2011},
%% author = {G Brunet and M Chechik and BCD Anda and
%%           DIK Sjøberg and ...:},
%% 'xxx',
%% IEEE Transactions on Software Engineering, Feb. 20??
%% xx(x):xx-xxx, DOI:??.
}
@book{pop000a87, %% @book => @article
author = {… and JE Hannay and E Arisholm and
H Engvik and DIK Sjøberg and ...},
title = {ARE ENGINEERING},
publisher = {ieeexplore.ieee.org},
url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5401362},
year = {2010},
note = {Query date: 14.06.2011},
}
@article{pop000a96,
author = {D Wang and FB Bastani and IL Yen and DIK Sjøberg and ...},
title = {ARE ENGINEERING},
journal = {117.55.241.},
url = {http://117.55.241.6/library/ieee/2005/Software%20Engineering/
Vol.%2031%20Issue%209/table%20of%20contents.pdf},
note = {Query date: 14.06.2011},
}
%%%%%%%%%%%%%%%%%%%%%%%% PoP-errors from pop-sjoberg-all.tex
4.2 CASE-2: The missing 12 PoP-papers in pop-sjoberg-excl-chem-mat.tex
======================================================================
pop-sjoberg-all.tex has 109 papers.
pop-sjoberg-excl-chem-mat.tex has 97 papers, so where are the other 12
(named X1-X12 below)?
None of these deals with Chemistry or Materials Science!
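Finding the entries that are present in one export but absent from another is a set difference. The sketch below compares entries by their content (entry type plus fields, with whitespace collapsed) rather than by key. It assumes that PoP writes the same paper with identical field text in both exports and that each entry ends with a closing brace on its own line, as in the listings in this Appendix. The function names and the two tiny sample exports are hypothetical.

```python
import re

# One entry: @type{key, <fields> up to a closing brace on its own line.
ENTRY_RE = re.compile(r"@(\w+)\{([^,\s]*),(.*?)\n\}", re.S)

def fingerprints(bibtex_text: str) -> dict[str, str]:
    """Map each entry's content (type + fields, whitespace collapsed) to its key."""
    result = {}
    for entry_type, key, body in ENTRY_RE.findall(bibtex_text):
        content = entry_type.lower() + " " + " ".join(body.split())
        result[content] = key
    return result

def missing_entries(full_text: str, subset_text: str) -> list[str]:
    """Keys (as used in full_text) of entries whose content is not in subset_text."""
    full, subset = fingerprints(full_text), fingerprints(subset_text)
    return sorted(full[c] for c in full.keys() - subset.keys())

all_fields = """@article{pop000a11,
  journal = {IEEE Transactions on Software Engineering},
  year = {2005},
}
@article{pop000a76,
  title = {ARE ENGINEERING},
  journal = {ieeexplore.ieee.org},
}
"""
excl_chem_mat = """@article{pop000b07,
  journal = {IEEE Transactions on Software Engineering},
  year = {2005},
}
"""
missing = missing_entries(all_fields, excl_chem_mat)
print(missing)
```

Comparing on content avoids relying on the entry keys, which may be numbered differently from one export to the next.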
X1. @article{pop000a11, =a1 in pop*.tex
author = {DIK Sjoberg and JE Hannay and ...},
title = {Vigdis By Kampenes, Amela Karahasanovic, Nils-Kristian Liborg,
Anette C. Rekdal, A Survey of Controlled Experiments in
Software Engineering},
journal = {IEEE Transactions on Software Engineering},
year = {2005},
note = {57 cites: http://scholar.google.com/scholar?
cites=3972648676543320866\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
%% Generally OK.
}
X2. @article{pop000a38, %% @article => @inproceedings
%% (+some @proceedings & editor)
author = {… and B Anda and M Jørgensen and DIK Sjøberg},
title = {Guidelines on Conducting Software Process Improvement
Studies in Industry},
journal = {Seminar in Scandinavia ( …},
publisher = {Citeseer},
year = {1999},
note = {9 cites: http://scholar.google.com/scholar?
cites=12426118134991045755\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
%% author = {Erik Arisholm, Bente Anda, Magne Jørgensen and
%% Dag I.K. Sjøberg},
%% title = {Guidelines on Conducting Software Process Improvement
%% Studies in Industry},
%% book = {Proc. 22nd Information Research Seminar in Scandinavia (IRIS)},
%% editor = {??},
%% publisher = {??},
%% year = {1999},
%% where = {Jyvaeskylae??, Finland},
%% pages = {??},
}
X3. @article{pop000a55, %% = X1 =a11 above, = a1 elsewhere
author = {DIK Sjøberg and JE Hannay and ...},
title = {Vigdis By Kampenes, Amela Karahasanovic, Nils-Kristian Liborg,
and Anette C. Rekdal. 2005." A Survey of Controlled Experiments
in Software …},
journal = {IEEE Transactions on Software Engineering},
note = {3 cites: http://scholar.google.com/scholar?
cites=14313076272726917795\&as_sdt=2005\&sciodt=0,5\&hl=en\&num=100},
note = {Query date: 14.06.2011},
%% Almost OK.
}
X4. @article{pop000a76, = a7
author = {… and T Dybå and DIK Sjøberg and JE Hannay and
DIK Sjøberg and ...},
title = {ARE ENGINEERING},
journal = {ieeexplore.ieee.org},
url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4052582},
note = {Query date: 14.06.2011},
%% author = {JE Hannay and DIK Sjoberg and T Dyba},%% Note author sequence
%% title = {A systematic review of Theory Use in software engineering
%% experiments},
%% IEEE TSE, 33(2):87-107, Feb. 2007.
%% NB: Preceding paper:
%% E. Arisholm, H. Gallis, T. Dybå, and D.I.K. Sjøberg:
%% Evaluating Pair Programming with Respect to System Complexity and
%% Programmer Expertise, on pp. 65-86 in same IEEE TSE issue!!
}
X5. @book{pop000a79, %% @book => @proceedings, author => editor
author = {… and AL Opdahl and DIK Sjøberg},
title = {Proceedings of NWPER'2000: Nordic Workshop on Programming
Environment Research: Lillehammer, Norway, May 28-30, 2000},
publisher = {University of Bergen, Dept. of …},
year = {2000},
note = {Query date: 14.06.2011},
}
X6. @article{pop000a84,
author = {DIK Sjøberg},
title = {Tittel: Quantifying schema evolution;
Publisert år: 2009 Dokumenttype: Artikkel Språk: Norsk Bokmål},
journal = {duo.uio.no},
url = {http://www.duo.uio.no/sok/work.html?WORKID=89933\&lang=no},
note = {Query date: 14.06.2011},
%% journal = {Information and Software Technology}, 35(1):35-44 (1993).
}
X7. @book{pop000a85, %% @book => @article
author = {… and G Brunet and M Chechik and BCD Anda and
DIK Sjøberg and ...},
title = {ARE ENGINEERING},
publisher = {ieeexplore.ieee.org},
url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5061639},
year = {2009},
note = {Query date: 14.06.2011},
%% B.C.D. Anda, D.I.K. Sjøberg, and A. Mockus:
%% Variability and Reproducibility in Software Engineering:
%% A Study of Four Companies that Developed the Same System
%% IEEE TSE, 35(3):407-429, May-June 2009.
%% DOI: 10.1109/TSE.2009.37.
%% Note the preceding paper in the same issue:
%% Sebastián Uchitel, Greg Brunet, Marsha Chechik:
%% Synthesis of Partial Behavior Models from Properties and Scenarios.
%% IEEE TSE, 35(3):384-406 (2009).
}
X8. @book{pop000a87, = a28, %% @book => @article
author = {… and JE Hannay and E Arisholm and H Engvik and
DIK Sjøberg and ...},
title = {ARE ENGINEERING},
publisher = {ieeexplore.ieee.org},
url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5401362},
year = {2010},
note = {Query date: 14.06.2011},
%% author ={J.E. Hannay and E. Arisholm and H. Engvik and D.I.K. Sjøberg},
%% title = {Effects of personality on pair programming},
%% IEEE TSE 36(1):61-80, Jan.-Feb. 2010.
}
X9. @book{pop000a89, %% @book => @inproceedings
%% (+ some @proceedings + editor)
author = {DIK Sjøberg and MP Atkinson and ...},
title = {Managing change in persistent object systems},
publisher = {en.scientificcommons.org},
url = {http://en.scientificcommons.org/42169124},
year = {1993},
note = {Query date: 14.06.2011},
%% author = {Malcolm P. Atkinson, Dag I. K. Sjøberg, Ronald Morrison},
%% Managing Change in Persistent Object Systems.
%% In Shojiro Nishio, Akinori Yonezawa (Eds.): Object Technologies for
%% Advanced Software, First JSSST International Symposium, Kanazawa,
%% Japan, November 4-6, 1993, Proceedings. Lecture Notes in Computer
%% Science 742, Springer Verlag, 1993, ISBN 3-540-57342-9, pp. 315-338.
@inproceedings{DBLP:conf/isotas/AtkinsonSM93,
author = {Malcolm P. Atkinson and
Dag I. K. Sj{\o}berg and
Ronald Morrison},
title = {Managing Change in Persistent Object Systems},
booktitle = {ISOTAS},
year = {1993},
pages = {315-338},
ee = {http://dx.doi.org/10.1007/3-540-57342-9_81},
crossref = {DBLP:conf/isotas/1993},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
@proceedings{DBLP:conf/isotas/1993,
editor = {Shojiro Nishio and
Akinori Yonezawa},
title = {Object Technologies for Advanced Software, First JSSST
International Symposium, Kanazawa, Japan, November 4-6, 1993,
Proceedings},
booktitle = {ISOTAS},
publisher = {Springer},
series = {Lecture Notes in Computer Science},
volume = {742},
year = {1993},
isbn = {3-540-57342-9},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
}
X10. @article{pop000a96, %% remove this entry!!
author = {D Wang and FB Bastani and IL Yen and DIK Sjøberg and ...},
title = {ARE ENGINEERING},
journal = {117.55.241.},
url = {http://117.55.241.6/library/ieee/2005/Software%20Engineering/
Vol.%2031%20Issue%209/table%20of%20contents.pdf}, %% ??
note = {Query date: 14.06.2011},
%% D. Wang, F.B. Bastani, I.-L. Yen:
%% Automated aspect-oriented decomposition of process-control
%% systems for ultra-high dependability assurance.
%% IEEE TSE, 31(9):713-732, Sept. 2005.
%% But the succeeding paper in this issue has DIK Sjøberg as first author!!
%% Dag I. K. Sjøberg, Jo Erskine Hannay, Ove Hansen, Vigdis By Kampenes,
%% Amela Karahasanovic, Nils-Kristian Liborg, Anette C. Rekdal: A
%% Survey of Controlled Experiments in Software Engineering.
%% IEEE TSE, 31(9): 733-753 (2005).
}
X11. @article{pop000a98, %% @article => @inproceedings
%% (+ some @proceedings & author=NN elsewhere)
author = {DIK Sjøberg and MP Atkinson and J Lopes and ...},
title = {Tittel: Building an integrated persistent application
Publisert år: 1993 Dokumenttype: Konferansebidrag Språk: Engelsk},
journal = {duo.uio.no},
url = {http://www.duo.uio.no/sok/work.html?WORKID=89912\&lang=no},
note = {Query date: 14.06.2011},
%% Dag I. K. Sjøberg, Malcolm P. Atkinson, João Lopes, Philip W. Trinder:
%% Building an Integrated Persistent Application.
%% Proc. Fourth International Workshop on
%% Database Programming Languages - Object
%% Models and Languages (DBPL), 1993, pp. 359-375.
}
X12. @article{pop00a100, = X9
author = {DIK Sjøberg and MP Atkinson and ...},
title = {Tittel: Managing change in persistent object systems
Publisert år: 1993 Dokumenttype: Konferansebidrag Språk: Engelsk},
journal = {duo.uio.no},
url = {http://www.duo.uio.no/sok/work.html?WORKID=89932},
note = {Query date: 14.06.2011},
}
end %%%%%%%%%%%%%%%%%% subset of pop-sjoberg-all.tex 14.06.2011
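The error patterns in X3-X12 above (the placeholder title "ARE ENGINEERING",
a host name in the journal or publisher field, library-catalogue metadata in
the title, a missing year) are regular enough to be flagged mechanically.
The following is only a minimal sketch in Python: the function name, the
dict-based entry format and the reason strings are assumptions made for
illustration, not part of PoP or of the files discussed here.

```python
import re

# A field value "looks like a host name" if it has no spaces and consists of
# dot-separated labels, e.g. "ieeexplore.ieee.org", "duo.uio.no", "117.55.241."
HOSTLIKE = re.compile(r"[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]*)+")


def suspicious_reasons(entry):
    """Return a list of reasons why a PoP entry (a dict of BibTeX fields,
    hypothetical format) deserves manual inspection. Empty list = no flag."""
    reasons = []
    title = entry.get("title", "").strip()
    if title.upper() == "ARE ENGINEERING":
        # Seen in X4, X7, X8 and X10: a fragment of the journal name, not a title.
        reasons.append("placeholder title")
    if title.startswith("Tittel:"):
        # Seen in X6, X11 and X12: catalogue record text copied into the title.
        reasons.append("catalogue metadata in title")
    for field in ("journal", "publisher"):
        value = entry.get(field, "").strip()
        if HOSTLIKE.fullmatch(value):
            reasons.append(field + " looks like a host name")
    if "year" not in entry:
        reasons.append("missing year")
    return reasons


if __name__ == "__main__":
    x4 = {"title": "ARE ENGINEERING", "journal": "ieeexplore.ieee.org"}
    for reason in suspicious_reasons(x4):
        print("X4:", reason)
```

Such a check only points at entries to inspect by hand; it cannot supply the
correct author list, title or venue, which still have to be looked up.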
4.3 CASE-3: Prune Conradi's 337=>297 papers on pop-conradi-EngCSMath.tex
========================================================================
File: pop-conradi-onlyEngCSMath.tex 15.06.2011
Query: R Conradi, onlyEngCSMath fields ('IT').
Summary:
Result: 337 seemingly valid bibtex entries.
285 R Conradi => C cc.
1 COnradi (no182 has a capital 'O')!
4 added in author list: (no137, no246, no282, no290)
without an explicit C cc.
7 of R CONRADI => R RRRRRRR (all in capital letters).
297 correct ones!!
Deducted as being other persons named Conradi (insert a filter for these?):
36 * Conradi => * OtherXX (* : 'R' or '')
4 * CONRADI => * OTHERXX (similar)
(In 5 URLs: *R+Conradi => X+XX, but not making extra bib-entry)
Note: a query for 'R Conradi' cannot be restricted to the single initial 'R';
it also matches '*R', i.e. an 'R' preceded by other initials.
Plus the usual number of inconsistencies!!
end %%%%%%%%%%%%%%%%%% pop-conradi-EngCSMath.tex (IT) 14.06.2011
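The filter asked for in the summary above could look roughly as follows.
This is a sketch only, in Python, assuming that the author field uses the
PoP form 'initials surname' with authors separated by ' and '; the function
name and the three result labels are hypothetical. It accepts the spelling
variants 'Conradi', 'COnradi' and 'CONRADI', but requires the initials to be
exactly 'R', so that '*R' matches are classified as another person.

```python
def classify_entry(author_field, surname="Conradi", initials="R"):
    """Classify one PoP author field (hypothetical helper).

    Returns "target"  if some author is <initials> <surname> (any letter case
                      in the surname, initials matched exactly),
            "other"   if the surname occurs only with different or no initials,
            "unknown" if the surname does not occur at all, e.g. because the
                      author list was truncated with '...'.
    """
    found_other = False
    for name in author_field.split(" and "):
        parts = name.strip().split()
        if not parts or parts[-1].lower() != surname.lower():
            continue
        if "".join(parts[:-1]) == initials:
            return "target"
        found_other = True
    return "other" if found_other else "unknown"


if __name__ == "__main__":
    # The co-author names below are made up for the demonstration.
    for field in ("R Conradi and A Nn", "R COnradi", "X Conradi", "A Nn and ..."):
        print(field, "=>", classify_entry(field))
```

Entries classified as "unknown" correspond to the 4 cases above where the
author list had to be completed by hand; they still need manual checking.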