No.
Explanation: During the transition from CVS to Subversion, it would sometimes be useful to have the new Subversion repository track activity in the CVS repository for a period of time until the final switchover. This would require each conversion to determine what had changed in CVS since the last conversion, and add those commits on top of the Subversion repository.
Unfortunately, cvs2svn/cvs2git does not support incremental conversions. With some work it would be possible to add this feature, but it would be difficult to make it robust. The trickiest problem is that CVS allows changes to the repository that have retroactive effects (e.g., affecting parts of the history that have already been converted).
Some conversion tools claim to support incremental conversions from CVS, but as far as is known none of them are reliable.
Volunteers or sponsorship to add support for incremental conversions to cvs2svn/cvs2git would be welcome.
No.
Explanation: Psyco is a Python extension that can speed up the execution of Python code by compiling parts of it into i386 machine code. Unfortunately, Psyco is known not to run cvs2svn correctly (this was last tested with the Psyco pre-2.0 development branch). When cvs2svn is run under Psyco, it crashes in OutputPass with an error message that looks something like this:
cvs2svn_lib.common.InternalError: ID changed from 2 -> 3 for Trunk, r2
The Psyco team has been informed about the problem.
cvs2svn requires direct filesystem access to a copy of the CVS repository that you want to convert. The reason for this requirement is that cvs2svn directly parses the *,v files that make up the CVS repository.
Many remote hosting sites provide access to backups of your CVS repository, which could be used for a cvs2svn conversion. For example:
If your provider does not provide any way to download your CVS repository, there are two known tools that claim to be able to clone a CVS repository via the CVS protocol:
It should be possible to use one of these tools to fetch a copy of your CVS repository from your provider, then to use cvs2svn to convert the copy. However, the developers of cvs2svn do not have any experience with these tools, so you are on your own here. If you try one of them, please tell us about your experience on the users mailing list.
If you need to convert certain CVS modules (in one large repository) to Subversion now and other modules later, you may want to convert your repository one module at a time. This situation is typically encountered in large organizations where each project has a separate lifecycle and schedule, and a one-step conversion process is not practical.
First you have to decide whether you want to put your converted projects into a single Subversion repository or into multiple ones. This decision mostly depends on the degree of coupling between the projects and is beyond the scope of this FAQ. See the Subversion book for a discussion of repository organization.
If you decide to convert your projects into separate Subversion repositories, then please follow the instructions in How can I convert part of a CVS repository? once for each repository.
If you decide to put more than one CVS project into a single Subversion repository, then please follow the instructions in How can I convert separate projects in my CVS repository into a single Subversion repository?.
This is easy: simply run cvs2svn normally, passing it the path of the project subdirectory within the CVS repository. Since cvs2svn ignores any files outside of the path it is given, other projects within the CVS repository will be excluded from the conversion.
Example: You have a CVS repository at path /path/cvsrepo with projects in subdirectories /path/cvsrepo/foo and /path/cvsrepo/bar, and you want to create a new Subversion repository at /path/foo-svn that includes only the foo project:
$ cvs2svn -s /path/foo-svn /path/cvsrepo/foo
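If several projects each need their own Subversion repository, the command above can be generated once per project. The following is only a sketch: the helper name build_conversion_commands and the "<project>-svn" naming scheme are assumptions made for illustration, and the sketch prints the commands instead of running them.

```python
import shlex


def build_conversion_commands(cvs_repo, svn_parent, projects):
    """Return one cvs2svn command (as an argument list) per project.

    Hypothetical helper: each project subdirectory of cvs_repo is converted
    into its own repository named <svn_parent>/<project>-svn.
    """
    commands = []
    for project in projects:
        commands.append([
            'cvs2svn',
            '-s', '%s/%s-svn' % (svn_parent, project),
            '%s/%s' % (cvs_repo, project),
        ])
    return commands


commands = build_conversion_commands('/path/cvsrepo', '/path', ['foo', 'bar'])
for command in commands:
    # Print each command in a form that can be pasted into a shell.
    print(' '.join(shlex.quote(arg) for arg in command))
```

Each argument list could also be passed to subprocess.run() once the printed commands look right.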
cvs2svn supports multiproject conversions, but you have to use the options file method to start the conversion. In your options file, you simply call run_options.add_project() once for each sub-project in your repository. For example, if your CVS repository has the layout:
/project_a
/project_b
and you want your Subversion repository to be laid out like this:
project_a/
    trunk/
        ...
    branches/
        ...
    tags/
        ...
project_b/
    trunk/
        ...
    branches/
        ...
    tags/
        ...
then you need to have a section like this in your options file:
run_options.add_project(
    'my/cvsrepo/project_a',
    trunk_path='project_a/trunk',
    branches_path='project_a/branches',
    tags_path='project_a/tags',
    symbol_transforms=[
        #...whatever...
        ],
    symbol_strategy_rules=[
        #...whatever...
        ],
    )
run_options.add_project(
    'my/cvsrepo/project_b',
    trunk_path='project_b/trunk',
    branches_path='project_b/branches',
    tags_path='project_b/tags',
    symbol_transforms=[
        #...whatever...
        ],
    symbol_strategy_rules=[
        #...whatever...
        ],
    )
The options file is Python code, executed by the Python interpreter. This makes it easy to automate parts of the configuration process. For example, to add many subprojects, you can write a Python loop:
projects = ['A', 'B', 'C', ...etc...]

cvs_repo_main_dir = r'test-data/main-cvsrepos'

for project in projects:
    run_options.add_project(
        cvs_repo_main_dir + '/' + project,
        trunk_path=(project + '/trunk'),
        branches_path=(project + '/branches'),
        tags_path=(project + '/tags'),
        # ...
        )
or you could even read the subprojects directly from the CVS repository:
import os

cvs_repo_main_dir = r'test-data/main-cvsrepos'
projects = os.listdir(cvs_repo_main_dir)

# Probably you don't want to convert CVSROOT:
projects.remove('CVSROOT')

for project in projects:
    # ...as above...
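When the project list is read from disk, it can help to skip stray files and to tolerate a missing CVSROOT directory, since list.remove() raises ValueError when the entry is absent. The helper below is a sketch; the name list_cvs_projects is not part of cvs2svn and is introduced here only for illustration.

```python
import os


def list_cvs_projects(cvs_repo_main_dir):
    """Return the sorted names of top-level project directories.

    Hypothetical helper: plain files and the CVSROOT administrative
    directory are skipped, so the result can be fed to add_project().
    """
    projects = []
    for name in sorted(os.listdir(cvs_repo_main_dir)):
        full_path = os.path.join(cvs_repo_main_dir, name)
        if name != 'CVSROOT' and os.path.isdir(full_path):
            projects.append(name)
    return projects
```

In an options file, the returned list would take the place of the hand-written projects list in the loop shown earlier.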
If foo is the only project that you want to convert, then either run cvs2svn like this:
$ cvs2svn --trunk=foo/trunk --branches=foo/branches --tags=foo/tags CVSREPO/foo
or use an options file that defines a project like this:
run_options.add_project(
    'my/cvsrepo/foo',
    trunk_path='foo/trunk',
    branches_path='foo/branches',
    tags_path='foo/tags',
    symbol_transforms=[
        #...whatever...
        ],
    symbol_strategy_rules=[
        #...whatever...
        ],
    )
If foo is not the only project that you want to convert, then you need to do a multiproject conversion; see How can I convert separate projects in my CVS repository into a single Subversion repository? for more information.
Warning: cvs2svn's handling of end-of-line options changed between version 1.5.x and version 2.0.x. This documentation applies to version 2.0.x and later. The documentation applying to an earlier version can be found in the www directory of that release of cvs2svn.
Starting with version 2.0, the default behavior of cvs2svn is to treat all files as binary except those explicitly determined to be text. (Previous versions treated files as text unless they were determined to be binary.) This behavior was changed because, generally speaking, it is safer to treat a text file as binary than vice versa.
However, it is often preferred to set svn:eol-style=native for text files, so that their end-of-line format is converted to that of the client platform when the file is checked out. This section describes how to get the settings that you want.
If a file is marked as binary in CVS (with cvs admin -kb), then cvs2svn will always treat the file as binary. For other files, cvs2svn has a number of options that can help choose the correct end-of-line translation parameters during the conversion:
--auto-props=FILE
    Set arbitrary Subversion properties on files based on the auto-props section of a file in svn config format. The auto-props file might have content like this:

        [auto-props]
        *.txt = svn:mime-type=text/plain;svn:eol-style=native
        *.doc = svn:mime-type=application/msword;!svn:eol-style

    This option can also be used in combination with --eol-from-mime-type. To force end-of-line translation off, use a setting of the form !svn:eol-style (with a leading exclamation point).

--mime-types=FILE
    Specifies an Apache-style mime.types file for setting files' svn:mime-type property based on the file extension. The mime-types file might have content like this:

        text/plain              txt
        application/msword      doc

    This option only has an effect on svn:eol-style if it is used in combination with --eol-from-mime-type.

--eol-from-mime-type
    Set svn:eol-style based on the file's mime type (if known). If the mime type starts with "text/", then the file is treated as a text file; otherwise, it is treated as binary. This option is useful in combination with --auto-props or --mime-types.

--default-eol=STYLE
    Usually cvs2svn treats a file as binary unless one of the other rules determines that it is not binary and it is not marked as binary in CVS. But if this option is specified, then cvs2svn uses the specified style as the default. STYLE can be 'binary' (default), 'native', 'CRLF', 'LF', or 'CR'. If you have been diligent about annotating binary files in CVS, or if you are confident that the above options will catch all of your binary files, then --default-eol=native should give good results.
If you don't use any of these options, then cvs2svn will not arrange any line-end translation whatsoever. The file contents in the SVN repository should be the same as the contents you would get if checking out with CVS on the machine on which cvs2svn is run. This also means that the EOL characters of text files will be the same no matter where the SVN data are checked out (i.e., not translated to the checkout machine's EOL format).
To do a better job, you can use --auto-props, --mime-types, and --eol-from-mime-type to specify exactly which properties to set on each file based on its filename.
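To preview how a mime.types file would classify your files before running a conversion, the "text/" rule described for --eol-from-mime-type can be approximated in a few lines of Python. This is a simplified sketch, not cvs2svn's own code: the function names are made up for illustration, and the real option handling may differ in details such as case sensitivity.

```python
import os


def parse_mime_types(text):
    """Map each file extension to its mime type (Apache mime.types format)."""
    mapping = {}
    for line in text.splitlines():
        line = line.split('#', 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        mime_type, extensions = fields[0], fields[1:]
        for extension in extensions:
            mapping[extension.lower()] = mime_type
    return mapping


def is_text_file(filename, mapping):
    """Return True or False when the mime type is known, None when it is not."""
    extension = os.path.splitext(filename)[1].lstrip('.').lower()
    mime_type = mapping.get(extension)
    if mime_type is None:
        return None
    return mime_type.startswith('text/')


MIME_TYPES = """\
text/plain          txt
application/msword  doc
"""

mapping = parse_mime_types(MIME_TYPES)
for filename in ['README.txt', 'report.doc', 'Makefile']:
    print(filename, is_text_file(filename, mapping))
```

A result of None marks the files that no mime-type rule covers; those would fall through to whatever --default-eol specifies.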
For total control over setting properties on files, you can use the --options-file method and write your own FilePropertySetter or RevisionPropertySetter in Python. For example,
from cvs2svn_lib.property_setters import FilePropertySetter

class MyPropertySetter(FilePropertySetter):
    def set_properties(self, cvs_file):
        if cvs_file.cvs_path.startswith('path/to/funny/files/'):
            cvs_file.properties['svn:mime-type'] = 'text/plain'
            cvs_file.properties['svn:eol-style'] = 'CRLF'

ctx.file_property_setters.append(MyPropertySetter())
See the file cvs2svn_lib/property_setters.py for more examples.
This is an example of how the cvs2svn conversion can be customized using Python.
Suppose you want to write symbol transform rules that are more complicated than "replace REGEXP with PATTERN". This can easily be done by adding just a little bit of Python code to your options file.
When a symbol is encountered, cvs2svn iterates through the list of SymbolTransform objects defined for the project. For each one, it calls symbol_transform.transform(cvs_file, symbol_name, revision). That method can return any legal symbol name, which will be used in the conversion instead of the original name.
To use this feature, you will have to use an options file to start the conversion. You then write a new SymbolTransform class that inherits from RegexpSymbolTransform but checks the path before deciding whether to transform the symbol. Add the following to your options file:
from cvs2svn_lib.symbol_transform import RegexpSymbolTransform

class MySymbolTransform(RegexpSymbolTransform):
    def __init__(self, path, pattern, replacement):
        """Transform only symbols that occur within the specified PATH."""

        self.path = path
        RegexpSymbolTransform.__init__(self, pattern, replacement)

    def transform(self, cvs_file, symbol_name, revision):
        # Is the file within the path we are interested in?
        if cvs_file.cvs_path.startswith(self.path + '/'):
            # Yes -> Allow RegexpSymbolTransform to transform the symbol:
            return RegexpSymbolTransform.transform(
                self, cvs_file, symbol_name, revision)
        else:
            # No -> Return the symbol unchanged:
            return symbol_name

# Note that we use a Python loop to fill the list of symbol_transforms:
symbol_transforms = []
for subdir in ['project1', 'project2', 'project3']:
    symbol_transforms.append(
        MySymbolTransform(
            subdir,
            r'release-(\d+)_(\d+)',
            r'%s-release-\1.\2' % subdir))

# Now register the project, using our own symbol transforms:
run_options.add_project(
    'your_cvs_path',
    trunk_path='trunk',
    branches_path='branches',
    tags_path='tags',
    symbol_transforms=symbol_transforms)
This example causes any symbol under "project1" that looks like "release-3_12" to be transformed into a symbol named "project1-release-3.12", whereas if the same symbol appears under "project2" it will be transformed into "project2-release-3.12".
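The renaming rule itself can be tried out without cvs2svn by using Python's re module directly. The function below is a stand-alone approximation written for illustration; it assumes the pattern must match the whole symbol name, and it is not the RegexpSymbolTransform implementation.

```python
import re


def transform_symbol(cvs_path, symbol_name, subdir):
    """Rename release-X_Y to <subdir>-release-X.Y for files under subdir.

    Stand-alone approximation: symbols on files outside subdir, and symbols
    that do not match the pattern, are returned unchanged.
    """
    pattern = r'release-(\d+)_(\d+)'
    replacement = r'%s-release-\1.\2' % subdir
    if not cvs_path.startswith(subdir + '/'):
        return symbol_name
    match = re.fullmatch(pattern, symbol_name)
    if match is None:
        return symbol_name
    return match.expand(replacement)


print(transform_symbol('project1/src/main.c', 'release-3_12', 'project1'))
print(transform_symbol('project2/src/main.c', 'release-3_12', 'project1'))
```

Checking a handful of real symbol names this way before a long conversion makes it easier to catch a pattern that is too loose or too strict.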
CVSNT is a version control system that started out by adding support for running CVS under Windows NT. Since then it has made numerous extensions to the RCS file format, to the point where CVS compatibility does not imply CVSNT compatibility with any degree of certainty.
cvs2svn might happen to successfully convert a CVSNT repository, especially if the repository has never had any CVSNT-only features used on it, but this use is not supported and should not be expected to work.
If you want to experiment with converting a CVSNT repository, then please consider the following suggestions:
Patches to support the conversion of CVSNT repositories would, of course, be welcome.
Attempting to run cvs2svn on a standard OS X 10.5.5 installation yields the following error:
ERROR: cvs2svn uses the anydbm package, which depends on lower level dbm libraries. Your system has dbm, with which cvs2svn is known to have problems. To use cvs2svn, you must install a Python dbm library other than dumbdbm or dbm. See http://python.org/doc/current/lib/module-anydbm.html for more information.
The problem is that the standard distribution of Python on OS X 10.5.5 does not include any dbm libraries other than the standard dbm. In order for cvs2svn to work, we need to install the gdbm library, in addition to a new version of Python that enables the Python gdbm module.
The precompiled versions of python for OS X available from python.org or activestate.com (currently version 2.6.2) do not have gdbm support turned on. To check for gdbm support, check for the library module (libgdmmodule.so) within the python installation.
Here is the procedure for a successful installation of cvs2svn and all supporting libs:
./configure

BINOWN = bin
BINGRP = bin
to
BINOWN = root
BINGRP = admin

#gdbm gdbmmodule.c -I/usr/local/include -L/usr/local/lib -lgdbm
to
gdbm gdbmmodule.c -I/usr/local/include -L/usr/local/lib -lgdbm

#*shared*
to
*shared*

./configure --enable-framework --enable-universalsdk in the top-level Python2.6 directory. This will configure the installation of python as a shared OS X framework, usable with OS X GUI frameworks and SDKs. You may have problems building if you don't have the SDKs that support the PPC platform. If you do, just specify --disable-universalsdk. By default, python will be installed in "/Library/Frameworks/Python.framework", which is what we want.

make

sudo make install

cd /usr/local/bin; sudo ln -s python2.6 python

source ~/.profile or source ~/.bashrc etc. or alternatively, just open a new shell window. When you type which python it should give you the new version in "/usr/local/bin", not the one in "/usr/bin".

sudo make install

sudo ln -s /Library/Frameworks/Python.framework/Versions/2.6/bin/cvs2svn /usr/local/bin/cvs2svn
sudo ln -s /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6 /usr/local/lib/python2.6
sudo ln -s /Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 /usr/local/include/python2.6
The installation is complete. Change directory out of the cvs2svn-2.2.0 installation directory, and you should be able to run cvs2svn. Be careful *not* to copy the version of cvs2svn in the cvs2svn-2.2.0 installation directory to /usr/local/bin, as this has a different python environment setting at the top of the file than the one that was installed in the /Library/Frameworks/Python.framework hierarchy. Follow the instructions exactly, and it should work.
Background: Normally, if you have a file called path/file.txt in your project, CVS stores its history in a file called repo/path/file.txt,v. But if file.txt is deleted on the main line of development, CVS moves its history file to a special Attic subdirectory: repo/path/Attic/file.txt,v. (If the file is recreated, then it is moved back out of the Attic subdirectory.) Your repository should never contain both of these files at the same time.
This cvs2svn error message thus indicates a mild form of corruption in your CVS repository. The file has two conflicting histories, and even CVS does not know the correct history of path/file.txt. The corruption was probably created by using tools other than CVS to backup or manipulate the files in your repository. With a little work you can learn more about the two histories by viewing each of the file.txt,v files in a text editor.
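To find every file affected by this kind of corruption in one pass, a copy of the repository can be scanned for ,v files that exist both inside and outside an Attic directory. The script below is a sketch written for this FAQ entry, not a tool shipped with cvs2svn; the function name is hypothetical, and it compares names case-sensitively.

```python
import os


def find_attic_conflicts(repo_root):
    """Return (live_path, attic_path) pairs for ,v files present in both places."""
    conflicts = []
    for dirpath, dirnames, filenames in os.walk(repo_root):
        if os.path.basename(dirpath) == 'Attic':
            continue  # Attic contents are compared from the parent directory
        attic = os.path.join(dirpath, 'Attic')
        if not os.path.isdir(attic):
            continue
        attic_files = set(
            name for name in os.listdir(attic) if name.endswith(',v'))
        for name in filenames:
            if name in attic_files:
                conflicts.append(
                    (os.path.join(dirpath, name), os.path.join(attic, name)))
    return sorted(conflicts)
```

Run it against the backup copy, for example with find_attic_conflicts('/path/to/copy/of/repo'), and inspect each reported pair before deciding which history to keep.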
There are four straightforward approaches to fixing the repository corruption, but each has potential disadvantages. Remember to make a backup before starting. Never run cvs2svn on a live CVS repository--always work on a copy of your repository.
# You did make a backup, right?
$ rm repo/path/Attic/file.txt,v

# You did make a backup, right?
$ rm repo/path/file.txt,v

# You did make a backup, right?
$ mv repo/path/file.txt,v repo/path/file-not-from-Attic.txt,v
If you run cvs2svn on a case-insensitive operating system, it is possible to get this error even if the filename of the file in Attic has different case than the one out of the Attic. This could happen, for example, if the CVS repository was served from a case-sensitive operating system at some time. A workaround for this problem is to copy the CVS repository to a case-sensitive operating system and convert it there.
The named file is corrupt in some way. (Corruption is surprisingly common in CVS repositories.) It is likely that even CVS has problems with this file; try checking out the head revision, revision 1.1, and the tip revision on each branch of this file; probably one or more of them don't work.
Here are some options:
This has been reported to be caused by trying to create gdbm databases on an NFS partition. Apparently gdbm does not support databases on NFS partitions. The workaround is to use the --tmpdir option to choose a local partition for cvs2svn to write its temporary files.
Some Macintosh CVS clients use a nonstandard trick to store the resource fork of files in CVS: instead of storing the file contents directly, store an AppleSingle data stream containing both the data fork and resource fork. When checking the file out, the client unpacks the AppleSingle data and writes the two forks separately to disk. By default, cvs2svn treats the file contents literally, so when you check the file out of Subversion, the file contains the combined data in AppleSingle format rather than only the data fork of the file as expected.
Subversion does not have any special facilities for dealing with Macintosh resource forks, so there is nothing cvs2svn can do to preserve both forks of your data. However, sometimes the resource fork is not needed. If you would like to discard the resource fork and only record the data fork in Subversion, then start your conversion using the options file method and set the following option to True in your options file:
ctx.decode_apple_single = True
There is more information about this option in the comments in cvs2svn-example.options.
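For readers curious what discarding the resource fork involves, the sketch below pulls the data fork out of an AppleSingle stream using the published layout: a 26-byte header followed by 12-byte entry descriptors, with entry ID 1 marking the data fork. It is an independent illustration and makes no claim about how cvs2svn implements ctx.decode_apple_single; the function name is hypothetical.

```python
import struct

APPLESINGLE_MAGIC = 0x00051600
DATA_FORK_ID = 1

# Header: magic, version, 16 filler bytes, number of entries (big-endian).
HEADER = struct.Struct('>II16xH')
# Entry descriptor: entry ID, offset from start of file, length.
ENTRY = struct.Struct('>III')


def extract_data_fork(blob):
    """Return the data fork of an AppleSingle stream.

    Content that does not start with the AppleSingle magic number is
    returned unchanged; a stream without a data fork yields b''.
    """
    if len(blob) < HEADER.size:
        return blob
    magic, _version, entry_count = HEADER.unpack_from(blob, 0)
    if magic != APPLESINGLE_MAGIC:
        return blob
    for index in range(entry_count):
        entry_id, offset, length = ENTRY.unpack_from(
            blob, HEADER.size + index * ENTRY.size)
        if entry_id == DATA_FORK_ID:
            return blob[offset:offset + length]
    return b''
```

Applying a function like this to a checked-out file is a quick way to confirm that its content really is AppleSingle data before rerunning a conversion.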
What are you using cvs2svn version 1.3.x for anyway? Upgrade!
But if you must, either install RCS, or ensure that CVS is installed and use cvs2svn's --use-cvs option.
Normally, people using "cvs import" don't specify the "-b" flag. cvs2svn handles this normal case fine.
If you have a file which has an active vendor branch, i.e. there have never been any trunk commits but only "cvs imports" onto the vendor branch, then cvs2svn will handle this fine. (Even if you've used the "-b" option to specify a non-standard branch number).
If you've used "cvs import -b <branch number>" with something other than the standard CVS vendor branch number of 1.1.1, and there has since been a commit on trunk (either a modification or a delete), then your history has been damaged. This isn't cvs2svn's fault. CVS simply doesn't record the branch number of the old vendor branch; it assumes that it was 1.1.1. You will even get the wrong results from "cvs checkout -D" with a date when the vendor branch was active.
Symptoms of this problem can include:
(Note: There are other possible causes for these symptoms; don't assume you have a non-standard vendor branch number just because you see them.)
The way to solve this problem is to renumber the vendor branch to the standard 1.1.1 branch number. This has to be done before you run cvs2svn. To help you do this, there is the "renumber_branch.py" script in the "contrib" directory of the cvs2svn distribution.
The typical usage, assuming you used "cvs import -b 1.1.2 ..." to create your vendor branch, is:
contrib/renumber_branch.py 1.1.2 1.1.1 repos/dir/file,v
You should only run this on a copy of your CVS repository, as it edits the repository in-place. You can fix a single file or a whole directory tree at a time.
The script will check that the 1.1.1 branch doesn't already exist; if it does exist then it will fail with an error message.
There are several sources of help for cvs2svn:
If you ask for help and/or report a bug on a mailing list, it is important that you include the following information. Failure to include important information is the best way to dissuade the volunteers of the cvs2svn project from trying to help you.
It is not so obvious how to subscribe to the cvs2svn mailing lists. There are two ways:
cvs2svn is an open source project that is largely developed and supported by volunteers in their free time. Therefore please try to help out by reporting bugs in a way that will enable us to help you efficiently.
The first question is whether the problem you are experiencing is caused by a cvs2svn bug at all. A large fraction of reported "bugs" are caused by problems with the user's CVS repository, especially mild forms of repository corruption or trying to convert a CVSNT repository with cvs2svn. Please also double-check the manual to be sure that you are using the command-line options correctly.
A good way to localize potential repository corruption is to use the shrink_test_case.py script (which is located in the contrib directory of the cvs2svn source tree). This script tries to find the minimum subset of files in your repository that still shows the same problem. Warning: Only apply this script to a backup copy of your repository, as it destroys the repository that it operates on! Often this script can narrow the problem down to a single file which, as often as not, is corrupt in some way. Even if the problem is not in your repository, the shrunk-down test case will be useful for reporting the bug. Please see "How can I produce a useful test case?" and the comments at the top of shrink_test_case.py for information about how to use this script.
Assuming that you still think you have found a bug, the next step is to investigate whether the bug is already known. Please look through the issue tracker for bugs that sound familiar. If the bug is already known, then there is no need to report it (though possibly you could contribute a useful test case or a workaround).
If your bug seems new, then the best thing to do is report it via email to the relevant cvs2svn mailing list. Be sure to include the information listed in "What information should I include when requesting help?"
If you need to report a bug, it is extremely helpful if you can include a test repository with your bug report. In most cases, if we cannot reproduce the problem, there is nothing we can do to help you. This section describes ways to overcome the most common problems that people have in producing a useful test case. When you have a reasonable-sized test case (say under 1 MB--the smaller the better), you can just tar it up and attach it to the email in which you report the bug.
You don't want to send us your proprietary information, and we don't want to receive it either. Short of open-sourcing your software, here is a way to strip out most of the proprietary information and simultaneously reduce the size of the archive tremendously.
The destroy_repository.py script tries to delete as much information as possible out of your repository while still preserving its basic structure (and therefore hopefully any cvs2svn bugs). Specifically, it tries to delete file descriptions, text content, all nontrivial log messages, and all author names. It also renames all files and directories to have generic names (e.g., dir015/file053,v). (It does not affect the number and dates of revisions to the files.)
# You did make a backup, right?
/path/to/contrib/destroy_repository.py /path/to/copy/of/repo
If running destroy_repository.py with its default options causes the bug to go away, consider using destroy_repository.py command-line options to leave part of the repository information intact. Run destroy_repository.py --help for more information.
This step is a tiny bit more work, so if your repository is already small enough to send you can skip this step. But this step helps narrow down the problem (maybe even point you to a corrupt file in your repository!) so it is still recommended.
The shrink_test_case.py script tries to delete as many files and directories from your repository as possible while preserving the cvs2svn bug. To use this command, you need to write a little test script that tries to convert your repository and checks whether the bug is still present. The script should exit successfully (e.g., "exit 0") if the bug is still present, and fail (e.g., "exit 1") if the bug has disappeared. The form of the test script depends on the bug that you saw, but it can be as simple as something like this:
#! /bin/sh
cvs2svn --dry-run /path/to/copy/of/repo 2>&1 | grep -q 'KeyError'
If the bug is more subtle, then the test script obviously needs to be more involved.
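A more involved check is often easier to write in Python than in shell. The sketch below follows the same contract as the shell example (exit status 0 while the bug is still present, 1 once it has disappeared); the function names and the 'KeyError' marker are placeholders to adapt to the bug you are chasing.

```python
import subprocess


def bug_still_present(output, marker='KeyError'):
    """Return True if the conversion output still shows the bug."""
    return marker in output


def run_conversion(repo_path):
    """Run a dry-run conversion and return its combined stdout and stderr."""
    result = subprocess.run(
        ['cvs2svn', '--dry-run', repo_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True)
    return result.stdout


def main(repo_path):
    """Return the exit status expected from a shrink test script."""
    return 0 if bug_still_present(run_conversion(repo_path)) else 1
```

To use this as the test script, end the file with "import sys" and "sys.exit(main('/path/to/copy/of/repo'))" and make it executable.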
Once the test script is ready, you can shrink your repository via the following steps:
# You did make a backup, right?
/path/to/contrib/shrink_test_case.py /path/to/copy/of/repo testscript.sh

Here testscript.sh is the name of the test script described above. shrink_test_case.py will execute testscript.sh many times, each time using a subset of the original repository.
Disclaimer: The links in this section are provided as a service to cvs2svn/cvs2git users. Neither Tigris.org, CollabNet Inc., nor the cvs2svn team guarantees the correctness, validity, or usefulness of these links. To add a link to this section, please submit it to the cvs2svn developers' mailing list.
Following is a list of known sources for commercial support for cvs2svn/cvs2git conversions: